Roadmap for the paper #19

scarrazza · 2020-04-06T11:47:12Z

Following the development, here is my wish list for the paper:

create final performance benchmark plots
create accuracy benchmark plots for NNPDF and other PDF sets
prepare examples (single top, FK convolution)
finalize code and related tasks.

scarrazza · 2020-04-06T11:47:38Z

@marcorossi5 could you please take care of the first 2 points?

marcorossi5 · 2020-04-07T12:27:21Z

I collected a lot of accuracy plots for different PDF sets, covering both central and non-central members.
As for the performance plots, I can only make them on my laptop (and maybe on a GPU at CERN, which I think is a Tesla P100). I don't have access to other devices.

scarlehoff · 2020-04-07T12:28:55Z

I can take care of the GPU performance plots, I have access to a good number of different GPUs.

marcorossi5 · 2020-04-07T14:42:53Z

I made the accuracy plots and collected them in this folder, from which they can be downloaded: https://cernbox.cern.ch/index.php/s/AfTAO0tMf0p3xk0
password: pdfflow
If you want to edit or upload something, you should be able to do that as well.

scarrazza · 2020-04-09T19:11:21Z

I have placed a pair of simple scripts in the paper repo (https://github.com/N3PDF/papers/pull/8). I don't understand why the code below is extremely slow in comparison to LHAPDF, and I was wondering whether multi-replica evaluation is something we should consider implementing:

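#!/usr/bin/env python
import time

import numpy as np
import pdfflow.pflow as pdf

# Load member 0 and the 1000 replicas (members 0..1000) of the set.
pdfs = [
    pdf.mkPDF(f'NNPDF31_nlo_as_0118_1000/{i}', dirname='/opt/lhapdf/share/LHAPDF/')
    for i in range(1001)
]
xgrids = np.loadtxt('xgrids.dat')
q2 = 1.65**2 * np.ones(len(xgrids))  # same Q^2 for every x point
fls = [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5]

# Evaluate every member on the same points and time the whole loop.
t0 = time.time()
for member in pdfs:  # renamed so the loop does not shadow the `pdf` module alias
    member.xfxQ2(fls, xgrids, q2)
print('total time (s):', time.time() - t0)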

marcorossi5 · 2020-04-09T21:28:38Z

Could it be because tf builds the graph the first time you call xfxQ2? The first iteration is way slower than the others. Here you rebuild a graph for each of the 1001 PDF members.
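One way to check this hypothesis is to time the first call separately from the following ones. The helper below is only a sketch: `time_first_and_rest` is a hypothetical name, not part of pdfflow, and it accepts any zero-argument callable.

```python
import time
from typing import Callable, List, Tuple


def time_first_and_rest(fn: Callable[[], object], repeats: int = 5) -> Tuple[float, List[float]]:
    """Return (time of the first call, list of times of `repeats` later calls)."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")

    # First call: this is where any one-off setup cost would show up.
    t0 = time.perf_counter()
    fn()
    first = time.perf_counter() - t0

    # Later calls: steady-state cost only.
    rest = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        rest.append(time.perf_counter() - t0)
    return first, rest


# Possible usage with one member from the script above (not executed here):
#   first, rest = time_first_and_rest(lambda: member.xfxQ2(fls, xgrids, q2))
#   print("first call (s):", first, "later calls (s):", rest)
```

If the first call dominates for every member, the cost is in the per-member setup rather than in the interpolation itself.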

scarrazza · 2020-04-10T10:03:16Z

Indeed, here are some numbers for pdfflow vs lhapdf timings on CPU. We should think about whether there are ways to improve on that:

LHAPDF

loading from file time (s): 19.299583673477173
total FK evaluation time (s): 13.704259634017944

PDFFlow

loading from file time (s): 88.5237967967987
dry run (s): 2184.7245454788208
total FK evaluation time (s): 2.3665452003479004
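To put these numbers in perspective, the snippet below derives a few ratios from them. It assumes that the dry run covers one first call for each of the 1001 members loaded in the script above; that count is an assumption, since the timings come from a different script.

```python
# Timings copied from the comment above (seconds, CPU).
lhapdf = {"load": 19.299583673477173, "eval": 13.704259634017944}
pdfflow = {
    "load": 88.5237967967987,
    "dry_run": 2184.7245454788208,
    "eval": 2.3665452003479004,
}
n_members = 1001  # assumption: same number of members as range(1001) in the script

# How much faster the PDFFlow evaluation is once the dry run is done.
eval_speedup = lhapdf["eval"] / pdfflow["eval"]

# Average one-off cost paid per member during the dry run.
dry_run_per_member = pdfflow["dry_run"] / n_members

# End-to-end cost including loading and the dry run.
total_ratio = sum(pdfflow.values()) / sum(lhapdf.values())

print(f"evaluation speed-up (LHAPDF / PDFFlow): {eval_speedup:.2f}")
print(f"dry run per member (s): {dry_run_per_member:.2f}")
print(f"total time ratio (PDFFlow / LHAPDF): {total_ratio:.1f}")
```

Under that assumption, the evaluation itself is several times faster than LHAPDF, but the dry run costs about two seconds per member and dominates the total.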

scarlehoff · 2020-04-10T10:21:06Z

As you said, a multi-replica implementation makes a lot of sense. There are also some intermediate solutions, such as making sure that the graph is not rebuilt from replica to replica (checking what changes from one to the next and turning it into a tensor instead of python/numpy variables).

My guess is that the second option (avoiding the rebuild) will make the first one (multi-replica) easier in the future, so there's that.

marcorossi5 · 2020-04-10T10:25:54Z

Currently the graph is grid dependent. Every instance of pdf creates a new graph because of the shape of the grid. We abstracted only the shape of the query points.
We could think of abstracting the concept of operational graph and making the ops independent of the grid. This way we could load a computational graph just once when a script (which wants to use pdfflow) starts, and reuse it every time we call the interpolation.
It's like having a placeholder for the grid.
I am wondering if this is closer to the concept of tf versions <=1.15 than to @tf.function.
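The difference between the two designs can be sketched with a pure-Python toy model. Nothing below uses TensorFlow or pdfflow: `make_baked`, `make_generic` and the `builds` counter are hypothetical names, and a one-dimensional linear interpolation stands in for the real interpolation.

```python
from bisect import bisect_right
from typing import Callable, Sequence, Tuple

Grid = Sequence[Tuple[float, float]]  # sorted (x, value) knots, a stand-in for a PDF grid

builds = {"baked": 0, "generic": 0}  # counts the simulated "graph builds"


def _lerp(grid: Grid, x: float) -> float:
    """Piecewise-linear interpolation, extrapolating linearly outside the knots."""
    xs = [knot[0] for knot in grid]
    i = min(max(bisect_right(xs, x) - 1, 0), len(grid) - 2)
    (x0, y0), (x1, y1) = grid[i], grid[i + 1]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def make_baked(grid: Grid) -> Callable[[float], float]:
    """Grid captured at build time: every grid needs its own build."""
    builds["baked"] += 1

    def interpolate(x: float) -> float:
        return _lerp(grid, x)

    return interpolate


def make_generic() -> Callable[[Grid, float], float]:
    """Grid passed at call time (the 'placeholder' idea): a single build."""
    builds["generic"] += 1

    def interpolate(grid: Grid, x: float) -> float:
        return _lerp(grid, x)

    return interpolate


# Three toy "replicas" that share the same grid shape but hold different values.
grids = [[(0.0, 0.0), (1.0, float(r))] for r in range(1, 4)]

baked = [make_baked(g) for g in grids]  # one build per replica
generic = make_generic()                # one build in total

results_match = all(f(0.5) == generic(g, 0.5) for f, g in zip(baked, grids))
print(builds, results_match)
```

Both variants return the same values; only the number of builds differs, which is the cost the thread is concerned with.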

scarlehoff · 2020-04-10T10:28:33Z

> We could think of abstracting the concept of operational graph and making ops independent of the grid. This way we could load a computational graph just once when a script (which wants to use pdfflow) starts and employing it every time we call interpolations.

Yes, this is what I was thinking. And I don't think that using a placeholder which we fill in later will change anything for one-replica calls since the grids are not that big anyway.

marcorossi5 · 2020-04-10T10:30:56Z

Can we insert a check that counts how many times the graph is being rebuilt?
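A minimal sketch of such a check is a decorator that counts how often a build function runs. `count_builds` and `build_graph` are hypothetical names used for illustration only. In TensorFlow, Python statements in the body of a `tf.function`-decorated function run only while tracing, so incrementing a Python counter there should give the number of traces in the same way.

```python
import functools
from typing import Callable


def count_builds(builder: Callable) -> Callable:
    """Wrap `builder` so that every call increments `wrapper.n_builds`."""

    @functools.wraps(builder)
    def wrapper(*args, **kwargs):
        wrapper.n_builds += 1
        return builder(*args, **kwargs)

    wrapper.n_builds = 0
    return wrapper


@count_builds
def build_graph(grid_shape):
    """Hypothetical stand-in for an expensive graph build."""
    return lambda x: x  # placeholder for the built function


# Simulate loading three members; each one triggers a build.
for shape in [(50, 30), (50, 30), (60, 30)]:
    build_graph(shape)

print("number of builds:", build_graph.n_builds)
```

Comparing the counter with the number of loaded members would show directly whether the graph is rebuilt for each of them.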

scarlehoff · 2020-09-14T08:59:12Z

So, the only thing missing here is adding docs for the normal usage.
@marcorossi5, could you add a section (maybe in overview) where the different PDF routines are used? Some examples like generating points for a few flavours at a few values of Q^2.
And then also a section (also in overview, maybe at the beginning) on how to install new PDFs, saying there are two options: either installing directly from LHAPDF, or downloading and pointing to the correct folder.

(I'd do it myself but I am in a -long- meeting, so I am not sure when I'll have time today before the paper gets sent at 8pm :P)

Also, the section "General Usage" should maybe have another name. Maybe "Advanced usage"...

marcorossi5 · 2020-09-14T09:05:46Z

Do you mean in the paper? Outlook section? Or appendix?

scarlehoff · 2020-09-14T09:06:39Z

No, no, the documentation in this repository, so that before submitting the paper we can generate the version 1 release of pdfflow.

marcorossi5 · 2020-09-14T09:07:02Z

Ok, I was confused

scarlehoff · 2020-09-14T09:23:27Z

Sorry, I was writing here while paying attention to the voice of people :__

marcorossi5 · 2020-09-14T09:25:25Z

Ok then, it's clear. I'm going to create a new branch from master named docs, to fix some typos I found and add these new sections.

scarlehoff · 2020-09-18T07:23:36Z

https://arxiv.org/abs/2009.06635

Radonirinaunimi mentioned this issue Sep 1, 2020

Switch on/off LogOutput && Multi-replicas computation #33

Closed

scarlehoff closed this as completed Sep 18, 2020
