Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial implementation of Scan component #4

Merged
merged 2 commits into from
Aug 24, 2023
Merged

Initial implementation of Scan component #4

merged 2 commits into from
Aug 24, 2023

Conversation

JamesWrigley
Copy link
Member

Leaving this as a draft for now because I still need to write tests, but y'all are welcome to try it on real data to make sure I'm not missing something obvious 馃槄

Known issues (that may or may not be fixed in the future):

  • Automatic resolution detection on non-linear scans is dodgy.
  • The first and last detected steps may be part of the motor backlash rather than part of the scan. Haven't thought much about how, but it would be nice to automatically detect and remove them.

@JamesWrigley JamesWrigley self-assigned this Jun 13, 2023
@JamesWrigley JamesWrigley changed the title Initial implementation of Scan component Draft: Initial implementation of Scan component Jun 13, 2023
@JamesWrigley JamesWrigley changed the title Draft: Initial implementation of Scan component Initial implementation of Scan component Jun 13, 2023
@JamesWrigley JamesWrigley marked this pull request as draft June 13, 2023 18:54
@JamesWrigley JamesWrigley force-pushed the scantool branch 2 times, most recently from d3de875 to 3dacf2f Compare June 16, 2023 09:15
@JamesWrigley JamesWrigley force-pushed the scantool branch 2 times, most recently from 5b3116b to 460d717 Compare June 20, 2023 08:26
Base automatically changed from scantool to master June 21, 2023 10:45
from datetime import timedelta

import numpy as np
from xarray import DataArray
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xarray is really slow to import and can take up to 10s of seconds especially from GPFS environments. Unlike in most other places it does look like this component will always use it, but since it's imported alongside any other component, I would prefer a lazy import.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, fixed in 7f69671.

min_trains (int): The minimum number of trains per-step in the scan. It
will be guessed if not passed explicitly.
"""
self._name = "motor"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When an xarray.DataArray is passed, it may be nice to set this argument manually via an optional argument or try to take it from the DataArray's name (KeyData.xarray() will set this to {source}.{key} by default).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah nice, didn't know about DataArray.name. Added in 7f69671.

@JamesWrigley
Copy link
Member Author

I rebased the branch, added some tests, and cleaned up the code a bit. Still not quite happy with it so I'm gonna leave it as a draft for now.

Copy link
Collaborator

@philsmt philsmt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave it a try for a demo in the FXE workshop, and it worked beautifully. Doing so I found a few minor usability improvements, but this component is frankly awesome.

src/extra/components/scan.py Outdated Show resolved Hide resolved
src/extra/components/scan.py Outdated Show resolved Hide resolved
src/extra/components/scan.py Outdated Show resolved Hide resolved
@JamesWrigley
Copy link
Member Author

Rebased and fixed a few things, but there's one more issue with noisy data (noticed in the FXE demo) I want to fix before this is ready for review.

@philsmt
Copy link
Collaborator

philsmt commented Jul 26, 2023

I was also wondering whether the fixtures could directly return the DataCollection 馃
@takluyver Could there be any state implications by doing so?

@JamesWrigley
Copy link
Member Author

Ok, I think this is ready for review now.

Recent changes:

  • Added an intra_step_filtering parameter for deciding how aggressively to filter trains within a step. Tweaking this fixes the issue I saw with FXE's noisy mono scan.
  • Moved the parameter docs into their individual docstrings.
  • Made __repr__() do something more conventional and recommend .info() instead to look at the scan info. Added .format() in case someone wants to save the string somewhere.

@JamesWrigley
Copy link
Member Author

(if there's no objections I'll probably merge this at the end of next week)

@philsmt
Copy link
Collaborator

philsmt commented Aug 4, 2023

LGTM!

One additional idea I got while remaking the XAS demo on top of AdqChannel and PumpProbePulses:

  • A property yielding tuples of position and extra_data.by_id slices.

@JamesWrigley
Copy link
Member Author

The problem with using slices is that they'll include any trains that are filtered out within the steps. My inclination is to stick with lists of train IDs, but then again I don't think I've seen any scans that have jumps within the steps 馃 Lemme think about that.

@philsmt
Copy link
Collaborator

philsmt commented Aug 7, 2023

Oh certainly, but I don't think it's that much of a deal, and slices would be there for convenience anyway. We could have an optional strict-style argument that throws an exception if no contiguous slices can be made. I'm curious though, what are the reasons right now a train in the middle can be filtered out?

@JamesWrigley
Copy link
Member Author

There's actually none that I can think of, the original reason I did it that way is to be on the safe side in case there was some badly behaving motor (or other device). But I've never seen it happen so it's probably safe to assume that we always want all trains within a step, or make that the default behaviour at least.

We always end up converting it to one anyway.
@JamesWrigley
Copy link
Member Author

I tried to switch to by_id by default, but that made the API a bit awkward because it's difficult to convert a by_id[] into a list of train IDs to easily find e.g. the length of a step (I know you could check the start/stop values directly but that's less convenient than len()) or pass it to DataArray.select(trainId=...) 鈽癸笍 So I stuck with a list of train IDs, but made it include all trains within a step in 90d0603 so we at least get the behaviour of by_id[], which I agree with you should be the default.

@JamesWrigley JamesWrigley merged commit b996a23 into master Aug 24, 2023
4 checks passed
@JamesWrigley JamesWrigley deleted the scan branch August 24, 2023 11:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants