DM-43297: Update streaming sequence finder for new file layout #34

Merged 6 commits on Mar 13, 2024
22 changes: 15 additions & 7 deletions python/lsst/summit/extras/fastStarTrackerAnalysis.py
@@ -90,13 +90,21 @@ def getStreamingSequences(dayObs):
     print(f"Found {len(regularFiles)} regular files on dayObs {dayObs}")

     data = {}
-    for filename in sorted(streamingFiles):
-        basename = os.path.basename(filename)
-        seqNum = int(basename.split("_")[3])
-        if seqNum not in data:
-            data[seqNum] = [filename]
-        else:
-            data[seqNum].append(filename)
+    if dayObs < 20240311:
Contributor

A quick comment as to the significance of this date would be nice (even if just for future you?!)
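
One way to address this request is to give the cutoff a name and a comment. The sketch below assumes the date is the first dayObs written with the new file layout, which the PR title suggests but the diff does not state. The constant and function names are hypothetical.

```python
# Hypothetical constant name; the value is the cutoff used in the diff.
# Assumption: this is the first dayObs on which streaming files were written
# with one directory per sequence, rather than as a flat list of files.
FIRST_DAYOBS_WITH_SEQUENCE_DIRS = 20240311


def usesSequenceDirs(dayObs):
    """Return True if files for ``dayObs`` are expected in per-sequence directories."""
    return dayObs >= FIRST_DAYOBS_WITH_SEQUENCE_DIRS
```

The check in the diff, `dayObs < 20240311`, is the negation of this helper.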

+        for filename in sorted(streamingFiles):
+            basename = os.path.basename(filename)
+            seqNum = int(basename.split("_")[3])
+            if seqNum not in data:
+                data[seqNum] = [filename]
+            else:
+                data[seqNum].append(filename)
+    else:
+        # dirs here isn't the fully dirname, it's just the base dirname
Contributor

"fully dirname"?!

So...call it baseDirs?!

Contributor Author

Sure, I've gone for dirNames and also improved the comment (which also contained a typo).

+        dirs = sorted(d for d in os.listdir(dataDir) if os.path.isdir(os.path.join(dataDir, d)))
+        for d in dirs:
+            files = sorted(glob.glob(os.path.join(dataDir, d, "*.fits")))
Contributor

I'll take your word for it that you really want to list all the .fits files 😀

Collaborator (@timj), Mar 13, 2024

Consider using ResourcePath.findFileResources to find them (with the advantage that it works on object stores as well).

Contributor Author

I do 🙂 Also, all this will be in the butler soon (I am told) so this is almost throw-away code (and that's also why it's in summit_extras)

Contributor Author

I'm never going to look in S3 for this, and like you say, one day this will just be a butler.get for a single object.
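
For reference, the `ResourcePath.findFileResources` suggestion could look roughly like the sketch below. It is not what was merged. The helper name `findFitsFiles` is hypothetical, and the fallback to `glob` is there only so the snippet runs outside an LSST stack environment. Note one behavioural difference: `findFileResources` searches directories recursively, whereas the `glob` call in the diff only looks at the top level of each directory.

```python
import glob
import os

try:
    # lsst.resources is only available inside an LSST stack environment.
    from lsst.resources import ResourcePath
except ImportError:
    ResourcePath = None


def findFitsFiles(directory):
    """Return the sorted local paths of the .fits files found in ``directory``."""
    if ResourcePath is None:
        # Fallback: the same non-recursive glob used in the diff.
        return sorted(glob.glob(os.path.join(directory, "*.fits")))
    # file_filter is a regular expression matched against file names.
    found = ResourcePath.findFileResources([directory], file_filter=r"\.fits$")
    # ospath is only meaningful for local files, which is all this sketch handles.
    return sorted(uri.ospath for uri in found)
```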

+            seqNum = int(d.split("_")[3])
+            data[seqNum] = files

     print(f"Found {len(data)} streaming sequences on dayObs {dayObs}:")
     for seqNum, files in data.items():
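
The two branches in this hunk can be exercised in isolation with a standalone sketch like the one below. It is a simplified stand-in for the logic in the diff, not the merged function. The function names and the file and directory names in the demo are hypothetical, chosen only so that the fourth underscore-separated field is the sequence number, as the `split("_")[3]` calls in the diff assume.

```python
import glob
import os
import tempfile

LAYOUT_CUTOFF = 20240311  # cutoff dayObs taken from the diff


def seqNumFromName(name):
    """Extract the sequence number from the fourth underscore-separated field."""
    return int(name.split("_")[3])


def groupBySeqNum(dayObs, streamingFiles=(), dataDir=None):
    """Map sequence number to a sorted list of files, for either layout."""
    data = {}
    if dayObs < LAYOUT_CUTOFF:
        # Old layout: a flat list of files, grouped by the seqNum in each name.
        for filename in sorted(streamingFiles):
            seqNum = seqNumFromName(os.path.basename(filename))
            data.setdefault(seqNum, []).append(filename)
    else:
        # New layout: one directory per sequence; the seqNum is in the directory name.
        dirNames = sorted(d for d in os.listdir(dataDir) if os.path.isdir(os.path.join(dataDir, d)))
        for dirName in dirNames:
            data[seqNumFromName(dirName)] = sorted(glob.glob(os.path.join(dataDir, dirName, "*.fits")))
    return data


# Demo with hypothetical names: two sequence directories holding two files each.
with tempfile.TemporaryDirectory() as dataDir:
    for dirName in ("x_y_20240311_000002", "x_y_20240311_000001"):
        os.makedirs(os.path.join(dataDir, dirName))
        for i in range(2):
            open(os.path.join(dataDir, dirName, f"frame{i}.fits"), "w").close()
    newLayout = groupBySeqNum(20240311, dataDir=dataDir)
    print({seqNum: len(files) for seqNum, files in newLayout.items()})
```

Using `setdefault` in the old-layout branch replaces the `if seqNum not in data` / `else` pair from the diff without changing the result.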