Feature: Find datadicts matching a set of conditions #379
Resolved a merge conflict with #375. |
I really like this feature! But at the moment, if the search encounters any invalid data (the writer always creates a file, even if nothing is inside it), the whole search fails. Because of this, it is hard to test on my end. I am also a little unsure whether it's a good idea for search_datadicts to return a generator instead of a list of all the matching datadicts. A generator makes sense since the datadicts might be big, but offering both the generator and a function that returns a list might be a good idea too and shouldn't take much effort. @wpfff what do you think? |
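To make the two suggestions above concrete, here is a minimal sketch of a generator that skips entries that fail to load, plus a thin wrapper that returns a list. This is not plottr's code: the names `iter_loadable` and `list_loadable`, the `load` callable, and the set of exceptions treated as "invalid data" are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Iterator, List, Tuple, TypeVar

T = TypeVar("T")


def iter_loadable(
    paths: Iterable[str], load: Callable[[str], T]
) -> Iterator[Tuple[str, T]]:
    """Lazily yield (path, data) pairs, skipping paths that fail to load.

    `load` is a hypothetical stand-in for whatever reads one dataset.
    Which exceptions count as "invalid data" is an assumption here.
    """
    for path in paths:
        try:
            data = load(path)
        except (OSError, KeyError, ValueError):
            # An empty or corrupt file should not abort the whole search.
            continue
        yield path, data


def list_loadable(paths: Iterable[str], load: Callable[[str], T]) -> List[Tuple[str, T]]:
    """Eager variant: materialize the generator into a list."""
    return list(iter_loadable(paths, load))
```

The generator keeps memory use low when datasets are large, while the list variant costs one extra line for callers who want everything at once.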
Thanks! I think I've resolved the error you encountered by fixing a bug in |
Added the following search conditions:
|
Hello, sorry for the late response; it's been a busy couple of weeks. I remember being able to test this, but no matter what I try now, the generator is always empty. @yoshi74ls181 could you give me an example of how it is supposed to be used? |
No worries! Sorry about flooding you with many pull requests recently; I don't mean to rush you at all. Here's a usage example:
from plottr.data.datadict_storage import DataDict, DDH5Writer, search_datadicts, search_datadict

basedir = "C:\\plottr-data"

# create two datasets
data = DataDict(x=dict(), y=dict(axes=["x"]))
with DDH5Writer(data, basedir, name="test") as writer:
    writer.add_data(x=[1, 2, 3], y=[1, 2, 3])
data = DataDict(x=dict(), y=dict(axes=["x"]))
with DDH5Writer(data, basedir, name="test") as writer:
    writer.add_data(x=[1, 2, 3], y=[3, 2, 1])

# print all datasets named "test" from today
for foldername, datadict in search_datadicts(basedir, "2023-03-17", name="test"):
    print(foldername, datadict["x"]["values"], datadict["y"]["values"])

# print just the newest one
foldername, datadict = search_datadict(basedir, "2023-03-17", name="test", newest=True)
print(foldername, datadict["x"]["values"], datadict["y"]["values"])

# print the one with specific date and time
foldername, datadict = search_datadict(basedir, "2023-03-17T200540", name="test")
print(foldername, datadict["x"]["values"], datadict["y"]["values"]) |
@yoshi74ls181 off-topic, but i couldn't find a way to message you in a different way :) |
@wpfff Have you received my email? I'm worried that it might have ended up in your spam folder because I sent it from my personal gmail account (I lost access to my university email when I graduated). No worries if it's just that you've been busy. |
this function is useful, and we have a similar one in our lab code -- but i'm not sure it should be part of plottr itself.
we're currently thinking on how to filter better in monitr, but we're not sure yet on the correct approach. |
This pull request adds a method plottr.data.datadict_storage.search_datadicts, which returns an iterator over datadicts matching a set of conditions.
The following conditions are currently supported:
- since: Date (and time) in the format YYYY-mm-dd (or YYYY-mm-ddTHHMMSS).
- until: Date (and time) in the format YYYY-mm-dd (or YYYY-mm-ddTHHMMSS). If not given, defaults to until = since.
- name: Name of the dataset (if not given, match all datasets).
For convenience, I've also added a method plottr.data.datadict_storage.search_datadict, which asserts that there is only one matching datadict.
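As a rough illustration of how the since / until / name conditions could be evaluated, here is a self-contained sketch that checks a single folder name. It is not the implementation in this pull request. The helper names `parse_bound` and `folder_matches` are hypothetical, and the sketch assumes folder names of the form `<YYYY-mm-ddTHHMMSS>_<id>-<name>` and that a date-only upper bound covers the whole day.

```python
from datetime import datetime, timedelta
from typing import Optional

DATE_FMT = "%Y-%m-%d"
DATETIME_FMT = "%Y-%m-%dT%H%M%S"


def parse_bound(text: str, end_of_day: bool = False) -> datetime:
    """Parse 'YYYY-mm-dd' or 'YYYY-mm-ddTHHMMSS' into a datetime."""
    if "T" in text:
        return datetime.strptime(text, DATETIME_FMT)
    day = datetime.strptime(text, DATE_FMT)
    # Assumption: a date-only upper bound means 23:59:59 on that day.
    return day + timedelta(days=1, seconds=-1) if end_of_day else day


def folder_matches(
    foldername: str,
    since: str,
    until: Optional[str] = None,
    name: Optional[str] = None,
) -> bool:
    """Return True if a folder name satisfies the since/until/name conditions."""
    # Assumed layout: '<timestamp>_<id>-<name>'; split at the first underscore.
    stamp, _, rest = foldername.partition("_")
    try:
        when = datetime.strptime(stamp, DATETIME_FMT)
    except ValueError:
        return False  # not a timestamped data folder
    lower = parse_bound(since)
    # If until is not given, it defaults to since.
    upper = parse_bound(until if until is not None else since, end_of_day=True)
    if not lower <= when <= upper:
        return False
    return name is None or rest.endswith("-" + name)
```

With these assumptions, passing only a date as since matches every folder from that day, while passing a full timestamp matches exactly that one folder.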