real-time proc codes and recipe #341
A few suggestions to improve conciseness, but looks good otherwise.
docs/recipes/real_time_proc.qmd
Outdated
## Set real-time processing parameters (if needed)

In this section, we define the window size and step size required for [rolling](https://dascore.org/api/dascore/proc/rolling/rolling.html) mean processing. With a sampling interval of 10 seconds, the cutoff frequency (Nyquist frequency) is determined to be 0.05 Hz. Additionally, we establish the desired wait time after each run by using the `sleep_time_mult` parameter, which acts as a multiplier coefficient for the number of seconds in each patch.
The link should use DASCore's internal linking:
[rolling](`dascore.proc.rolling.rolling`)
That way each doc build will link to its own version of the rolling page rather than pointing to this static URL.
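For reference, the figures quoted in the recipe text for this thread can be checked with a few lines of arithmetic. This is only a sketch: the 10-second sampling interval and the `sleep_time_mult` name come from the recipe, the value 3 is the one suggested later in this review, and the patch length of 30 samples is a made-up value for illustration.

```python
# Sampling interval from the recipe text (seconds per sample).
sampling_interval = 10.0

# Nyquist frequency: half the sampling rate, i.e. 1 / (2 * dt).
nyquist_hz = 1.0 / (2.0 * sampling_interval)

# Hypothetical patch length (not from the PR): 30 samples per patch.
samples_per_patch = 30
num_sec = samples_per_patch * sampling_interval

# Wait time after each run is a multiple of the patch duration.
sleep_time_mult = 3
sleep_time = num_sec * sleep_time_mult

print(nyquist_hz, num_sec, sleep_time)
```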
docs/recipes/real_time_proc.qmd
Outdated
---

This recipe serves as an example to showcase the real-time processing capability of DASCore. Here, we demonstrate how to use DASCore to perform rolling mean processing on a spool in real time for edge computing purposes.
I suggest using the term "near real-time batch processing". True real-time usually implies some kind of high frequency streaming.
docs/recipes/real_time_proc.qmd
Outdated
run_num = i+1
print(f"\nRun number: {run_num}")

# Select a updated sub-spool
I think you need "Select an updated sub-spool"
docs/recipes/real_time_proc.qmd
Outdated
print(f"\nRun number: {run_num}")

# Select a updated sub-spool
sp = dc.spool(data_path).sort("time").update()
I am not 100% sure, but I don't think update will preserve the sorting, so I suggest you change to:
sp = dc.spool(data_path).update().sort("time")
Also, any reason not to define the spool outside of the while loop so we don't have to init it each iteration?
docs/recipes/real_time_proc.qmd
Outdated
len_updated_sp = len(sp)

# Get number of seconds in the first patch
sampling_interval = sp[0].attrs['d_time']
We are trying to move away from using attrs to get coord info. I suggest:
sp[0].coords.step("time")
Oops, I forgot to change this from my old code.
docs/recipes/real_time_proc.qmd
Outdated
# Get number of seconds in the first patch
sampling_interval = sp[0].attrs['d_time']
num_sec = len(sp[0].coords["time"]) * sampling_interval
Maybe clearer as:
num_sec = sp[0].coords.max("time") - sp[0].coords.min("time")
Then we don't need `sampling_interval` in the previous line.
So, actually:
num_sec = sp[0].coords.max("time") - sp[0].coords.min("time") + sampling_interval
since `sp[0].coords.min("time")` is equal to the patch's start time + `sampling_interval`
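A small stand-alone check of the arithmetic in this thread, assuming evenly spaced samples and using a plain Python list rather than a DASCore patch (all values below are made up): the span between the first and last sample covers one fewer interval than there are samples, so adding one step recovers the `len(...) * sampling_interval` value from the original line.

```python
# Hypothetical evenly spaced time axis: 6 samples, 10 s apart.
sampling_interval = 10
times = [100 + i * sampling_interval for i in range(6)]

# Original approach: number of samples times the step.
by_count = len(times) * sampling_interval

# The span from first to last sample covers only len(times) - 1 intervals.
span = max(times) - min(times)

# Adding one step makes the two approaches agree.
by_span = span + sampling_interval

print(by_count, span, by_span)
```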
docs/recipes/real_time_proc.qmd
Outdated
num_sec = len(sp[0].coords["time"]) * sampling_interval

# Set sleep time after each run to the
sleep_time = num_sec * sleep_time_mult
Good idea on the sleep time being a multiple of the patch time duration.
docs/recipes/real_time_proc.qmd
Outdated
# Sleep longer
time.sleep(4*num_sec)

# Check whether new data was detected in the spool
sp = dc.spool(data_path).sort("time").update()
len_updated_sp = len(sp)

if len_last_sp == len_updated_sp:
    print(f"No new data was detected in spool even after "
          "four times of patch time, {4*num_sec} sec. "
          "Real-time data processing ended successfully.")
    break
Since the example is getting a bit long already, maybe remove this and just change `sleep_time_mult` to 3? Then update the print statement and just break like it was before.
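To make the simplified control flow concrete, here is a rough, library-agnostic sketch of a single-wait polling loop: process whatever is new, sleep once, and stop as soon as a poll finds nothing new. It deliberately avoids DASCore; `watch_directory`, `process`, and the injectable `sleep` argument are hypothetical names for illustration, and plain files in a directory stand in for patches in a spool.

```python
import time
from pathlib import Path


def watch_directory(data_path, process, sleep_time, sleep=time.sleep):
    """Process new files in batches until a poll finds nothing new.

    All names here are hypothetical stand-ins, not DASCore API.
    Returns the number of polling runs performed.
    """
    seen = set()
    run_num = 0
    while True:
        run_num += 1
        # Re-scan the directory and keep only files not handled yet.
        current = sorted(p.name for p in Path(data_path).iterdir() if p.is_file())
        new_files = [name for name in current if name not in seen]
        # After the initial run, an empty poll ends the loop.
        if not new_files and run_num > 1:
            print("No new data detected; stopping.")
            break
        for name in new_files:
            process(Path(data_path) / name)
            seen.add(name)
        # Wait a multiple of the patch duration before polling again.
        sleep(sleep_time)
    return run_num
```

Passing `sleep` as an argument is only a convenience so the loop can be exercised without real waiting; in the recipe the wait would be `num_sec * sleep_time_mult`.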
docs/recipes/real_time_proc.qmd
Outdated
initial_run = (i == 0)
run_num = i+1
Can we just set `i = 1` and `initial_run = (i==1)` so we can remove line 58?
Also, probably goes without saying, but before merging, it would be good to manually run the code to make sure it works 😄. With the `exec: false` setting the doc build won't actually verify that for us.
Thank you for the comments! Please feel free to re-open if any further improvement is needed.
Description
This PR provides code and a demonstration of real-time processing.
Checklist
I have (if applicable):