Possibility for normalization by the number of FEL pulses: #101

Closed
yacremann opened this issue Feb 7, 2023 · 11 comments · Fixed by #116

Comments

@yacremann

We need a way to normalize the binned signal by the number of FEL pulses or by the intensity monitor of the free-electron laser.

Preferred solution:
Please add a way to specify whether the SED processor operates in the standard per-electron mode or in a new per-pulse mode. We could then instantiate a second processor in per-pulse mode and use it for normalization.

We also considered including an additional per-pulse dataframe in SED, but this would significantly alter the structure of the code and would reduce performance when normalization is not needed.

@rettigl
Member

rettigl commented Feb 8, 2023

I'm not 100% sure if that works, but it appears to me that you can select the index via the config parameter dict "channels"

self.all_channels: dict = self._config.get("channels", {})

It should be straightforward to create two instances with different configs.
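A minimal sketch of the two-config idea, using plain dictionaries only. The channel names and the "format" tag on each channel are hypothetical and are not taken from the actual SED config schema; only the top-level "channels" key comes from the line quoted above.

```python
from copy import deepcopy

# Hypothetical config layout: every channel carries a "format" tag.
# Neither the tag nor the channel names are confirmed SED settings.
base_config = {
    "channels": {
        "dldPosX": {"format": "per_electron"},
        "dldPosY": {"format": "per_electron"},
        "dldTime": {"format": "per_electron"},
        "delayStage": {"format": "per_pulse"},
        "pulseEnergy": {"format": "per_pulse"},
    }
}


def select_channels(config, keep_formats):
    """Return a copy of config that keeps only channels with a wanted format."""
    new_config = deepcopy(config)
    new_config["channels"] = {
        name: spec
        for name, spec in new_config["channels"].items()
        if spec["format"] in keep_formats
    }
    return new_config


# One config per processor instance: the full table and a per-pulse-only table.
per_electron_config = select_channels(base_config, {"per_electron", "per_pulse"})
per_pulse_config = select_channels(base_config, {"per_pulse"})
```

Each of the two dictionaries could then be handed to its own processor instance, provided the loader builds its table from the listed channels.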

@steinnymir
Member

@yacremann I believe you would want the old ddMicrobunches table, right?
If I am not mistaken, this can be obtained from the current dataframe by dropping the electron index and keeping only the per-pulse channels, as @rettigl correctly suggests.
I can look into providing simpler access to this table, as it is indeed useful for normalizing FEL intensity, but also delay stage positions (less crucial with the current laser system).

@rettigl
Member

rettigl commented Feb 10, 2023

I was also thinking about this in the context of lab experiments. In principle, such a normalization histogram could be derived from a per-electron dataframe, if every electron has a timestamp, which could be generated for the FLASH data, and is already implemented for our data.

@steinnymir
Member

A timestamp is already included with the FLASH data.
However, there is a column called pulseId that tracks the id of the pulse in a train. It can easily be used to extract a table indexed only on pulses, i.e. FEL shots.
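As a rough illustration of that filtering, here is a sketch with pandas on a toy table. Only pulseId is named in this thread; the trainId, dldPosX and delayStage columns and all values are assumptions made for the example.

```python
import pandas as pd

# Toy per-electron table: one row per detected electron.
# Column names other than pulseId are illustrative assumptions.
electrons = pd.DataFrame(
    {
        "trainId": [1, 1, 1, 1, 2, 2],
        "pulseId": [0, 0, 1, 3, 0, 0],
        "dldPosX": [10, 12, 7, 3, 9, 11],
        "delayStage": [0.1, 0.1, 0.1, 0.1, 0.2, 0.2],
    }
)

# Keep only the channels that are constant within a pulse, then keep
# a single row for each (train, pulse) pair.
per_pulse_columns = ["trainId", "pulseId", "delayStage"]
pulses = (
    electrons[per_pulse_columns]
    .drop_duplicates(subset=["trainId", "pulseId"])
    .reset_index(drop=True)
)
print(pulses)
```

In this toy data, pulse 2 of train 1 produced no electron, so it has no row in the derived table.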

@zain-sohail
Member

We can provide a notebook with said idea of filtering, or it can be part of the workflow within the context of workflow manager (makes more sense).

@steinnymir
Member

A notebook or example of the exact use case for the required feature is always greatly appreciated!

If I remember correctly, the main use of the per-pulse dataframe was to bin 1D traces in parallel to the final binning array and use them as normalization arrays. This should be easy to implement in the workflow-manager/processor class.
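A sketch of such a parallel 1D normalization trace with NumPy, on synthetic data. The variable names and the choice of the delay axis are assumptions for illustration, not the SED implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic per-pulse data: the delay value of every FEL pulse.
pulse_delay = rng.uniform(0.0, 1.0, size=2_000)
# Synthetic per-electron data: each pulse yields a Poisson number of electrons,
# and every electron inherits the delay value of its pulse.
electrons_per_pulse = rng.poisson(3.0, size=pulse_delay.size)
electron_delay = np.repeat(pulse_delay, electrons_per_pulse)

# Bin both tables on the same axis with identical bin edges.
edges = np.linspace(0.0, 1.0, 11)
signal, _ = np.histogram(electron_delay, bins=edges)
n_pulses, _ = np.histogram(pulse_delay, bins=edges)

# Electrons per pulse in each bin; bins without pulses are left at zero.
normalized = np.divide(
    signal, n_pulses, out=np.zeros(signal.shape), where=n_pulses > 0
)
print(normalized)
```

For a multidimensional binning result, the 1D trace would be broadcast along the matching axis before dividing.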

@rettigl
Member

rettigl commented Feb 14, 2023

I have thought about this a bit. In principle, maintaining two dataframes seems a bit redundant and not a good design choice. On the other hand, deriving e.g. a "time per electron" column from timestamps or bunch numbers probably creates a substantial amount of overhead, so the former approach might perform much better...

@yacremann
Author

yacremann commented Feb 15, 2023

I am also not sure that generating a per-pulse dataframe from the per-electron dataframe is efficient. In addition, it will be necessary to remove duplicates, which is likely slow.
Also, I guess this is a functionality which ideally does not require the user to access the ("private") dataframe directly. I think this should be a functionality accessible from the processor class.

About normalization: Sometimes we will need to just normalize against the number of FEL pulses, sometimes it will be necessary to normalize against pulse energy measurement devices (the normalization is then also done per bin).
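To make the two cases concrete, here is a small NumPy sketch with made-up numbers. The arrays, the units and the helper `safe_divide` are hypothetical; the only technique used is a weighted histogram of the per-pulse table.

```python
import numpy as np

# Hypothetical per-pulse table: delay value and measured pulse energy.
pulse_delay = np.array([0.1, 0.1, 0.3, 0.3, 0.3, 0.7])
pulse_energy = np.array([10.0, 30.0, 20.0, 20.0, 10.0, 50.0])  # arbitrary units
# Hypothetical per-electron table: delay value of every detected electron.
electron_delay = np.array([0.1, 0.1, 0.1, 0.1, 0.3, 0.3, 0.7, 0.7, 0.7, 0.7, 0.7])

edges = np.array([0.0, 0.2, 0.4, 0.6, 0.8])

signal, _ = np.histogram(electron_delay, bins=edges)
# Case 1: number of FEL pulses that fall into each bin.
pulses_per_bin, _ = np.histogram(pulse_delay, bins=edges)
# Case 2: summed pulse energy per bin, via histogram weights.
energy_per_bin, _ = np.histogram(pulse_delay, bins=edges, weights=pulse_energy)


def safe_divide(numerator, denominator):
    """Divide per bin; bins with a zero denominator become NaN."""
    numerator = np.asarray(numerator, dtype=float)
    denominator = np.asarray(denominator, dtype=float)
    return np.divide(
        numerator,
        denominator,
        out=np.full(numerator.shape, np.nan),
        where=denominator > 0,
    )


signal_per_pulse = safe_divide(signal, pulses_per_bin)
signal_per_energy = safe_divide(signal, energy_per_bin)
```

Both normalizations need the complete per-pulse table, since each denominator is accumulated over all pulses in a bin.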

From this point-of-view, I still think the most flexible and simplest solution will be to use two config files and generate a per-electron and per-pulse processor.

@yacremann
Author

There is an additional reason why just indexing out a new table is not perfect: If there is an FEL pulse which did not generate electrons on the detector, it will not show up on the table organized by detected electrons. I agree that this is often a small error, but still...

Normalization may not be essential in a laboratory setup, but is very important at the FEL.

The easiest would be to add a way to tell the reader that we want a processor with a table organized by FEL pulse (instead of by electron). This processor can be used for normalization.

@rettigl
Member

rettigl commented May 26, 2023

PR #116 implements normalization histograms either from a timed (i.e. per-shot) dataframe or from a timestamp column.
Regarding the last reason you bring up: even shots that don't produce electrons should be properly accounted for in the normalization if timestamps are applied correctly, because the next electron then simply gets assigned a longer time.
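The timestamp argument can be illustrated with a short NumPy sketch. This is not the code from PR #116; the arrays and names are invented, and the sketch assumes the electrons are sorted by timestamp.

```python
import numpy as np

# Hypothetical per-electron data, sorted by arrival time.
timestamp = np.array([0.0, 0.1, 0.2, 0.5, 0.6, 1.0])  # seconds
delay = np.array([0.1, 0.1, 0.1, 0.3, 0.3, 0.3])      # binning axis

# Time elapsed since the previous electron. The first electron gets zero.
# A stretch of shots without electrons shows up as a longer interval on
# the next detected electron (here 0.3 s and 0.4 s).
time_per_electron = np.diff(timestamp, prepend=timestamp[0])

edges = np.array([0.0, 0.2, 0.4])
signal, _ = np.histogram(delay, bins=edges)
# Acquisition time per bin: histogram weighted by the time per electron.
time_per_bin, _ = np.histogram(delay, bins=edges, weights=time_per_electron)

# Count rate per bin; bins without acquisition time become NaN.
rate = np.divide(
    signal, time_per_bin, out=np.full(signal.shape, np.nan), where=time_per_bin > 0
)
print(rate)
```

In this sketch the interval is credited to the bin of the electron that ends it, so the result is only approximate near the points where the scanned axis changes value.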

@rettigl
Member

rettigl commented Jun 19, 2023

@zainsohail04 This could now be implemented for the FLASH loader in PR #116.

@steinnymir steinnymir added this to the FLASH Beamtime milestone Nov 3, 2023
@steinnymir steinnymir linked a pull request Nov 4, 2023 that will close this issue

4 participants