real-time proc codes and recipe #341

ahmadtourei merged 4 commits into master from real_time_proc on Jan 11, 2024

Conversation

ahmadtourei (Collaborator):

Description

This PR provides code and a demonstration of real-time processing.

Checklist

I have (if applicable):

  • referenced the GitHub issue this PR closes.
  • documented the new feature with docstrings or appropriate doc page.
  • included a test. See testing guidelines.
  • added my name to the contributors page (docs/contributors.md).
  • added the "ready_for_review" tag once the PR is ready to be reviewed.

ahmadtourei added the ready_for_review label on Jan 10, 2024

d-chambers (Contributor) left a comment:

A few suggestions to improve conciseness, but it looks good otherwise.


## Set real-time processing parameters (if needed)

In this section, we define the window size and step size required for [rolling](https://dascore.org/api/dascore/proc/rolling/rolling.html) mean processing. With a sampling interval of 10 seconds, the cutoff frequency (Nyquist frequency) is 0.05 Hz. Additionally, we set the desired wait time after each run using the `sleep_time_mult` parameter, which acts as a multiplier on the number of seconds in each patch.

d-chambers (Contributor):

The link should use DASCore's internal linking:

[rolling](`dascore.proc.rolling.rolling`)

That way each doc build will link to its own version of the rolling page rather than pointing to this static URL.
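
For reference, a minimal sketch of what such a parameter block might look like; the names (`sampling_interval`, `window`, `step`, `sleep_time_mult`) follow the paragraph above and the exact values are illustrative only:

```python
import numpy as np

# Sampling interval of the data being processed (seconds per sample)
sampling_interval = 10.0

# Cutoff (Nyquist) frequency implied by that interval: 1 / (2 * 10 s) = 0.05 Hz
cutoff_freq = 1 / (2 * sampling_interval)

# Window and step sizes for the rolling mean, expressed as time deltas
window = np.timedelta64(100, "s")
step = np.timedelta64(50, "s")

# Wait time after each run, expressed as a multiple of the patch duration
sleep_time_mult = 1.5
```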

---


This recipe showcases the real-time processing capability of DASCore. Here, we demonstrate how to use DASCore to perform rolling mean processing on a spool in real time for edge computing purposes.

d-chambers (Contributor):

I suggest using the term "near real-time batch processing". True real-time usually implies some kind of high frequency streaming.

run_num = i+1
print(f"\nRun number: {run_num}")

# Select a updated sub-spool

d-chambers (Contributor):

I think you need "Select an updated sub-spool"

print(f"\nRun number: {run_num}")

# Select a updated sub-spool
sp = dc.spool(data_path).sort("time").update()

d-chambers (Contributor):

I am not 100% sure, but I don't think update will preserve the sorting, so I suggest you change to:

sp = dc.spool(data_path).update().sort("time")

Also, any reason not to define the spool outside of the while loop so we don't have to init it each iteration?
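
A minimal sketch of that suggestion, assuming `data_path` points at the directory receiving new DAS files and that `update()` returns the refreshed spool:

```python
import dascore as dc

data_path = "path/to/das_files"  # hypothetical directory written to in real time

# Create the spool once, outside the loop ...
spool = dc.spool(data_path)

while True:
    # ... then refresh its index for newly arrived files each pass,
    # sorting after the update so the order is guaranteed.
    sp = spool.update().sort("time")
    len_updated_sp = len(sp)
    ...  # rolling-mean processing on the new patches goes here
```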

len_updated_sp = len(sp)

# Get number of seconds in the first patch
sampling_interval = sp[0].attrs['d_time']

d-chambers (Contributor):

We are trying to move away from using attrs to get coord info. I suggest:

sp[0].coords.step("time")

ahmadtourei (Collaborator, Author):

Oops, I forgot to change this from my old code.


# Get number of seconds in the first patch
sampling_interval = sp[0].attrs['d_time']
num_sec = len(sp[0].coords["time"]) * sampling_interval

d-chambers (Contributor):

Maybe clearer as:

num_sec = sp[0].coords.max("time") - sp[0].coords.min("time")

Then we don't need sampling_interval in the previous line.

ahmadtourei (Collaborator, Author):

So, actually:

num_sec = sp[0].coords.max("time") - sp[0].coords.min("time") + sampling_interval

since sp[0].coords.max("time") - sp[0].coords.min("time") spans one sampling interval less than the full patch duration.
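
Combining the two suggestions above, a sketch of the duration calculation (assuming the time coordinate values are numpy datetime64, so differences come back as timedelta64):

```python
import numpy as np

patch = sp[0]

# Sampling interval taken from the coordinate rather than attrs
sampling_interval = patch.coords.step("time")

# Patch duration per the discussion above: (max - min) plus one sampling interval
duration = patch.coords.max("time") - patch.coords.min("time") + sampling_interval

# Convert the timedelta64 to float seconds for use with time.sleep()
num_sec = duration / np.timedelta64(1, "s")
```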

num_sec = len(sp[0].coords["time"]) * sampling_interval

# Set sleep time after each run to a multiple of the patch duration
sleep_time = num_sec * sleep_time_mult

d-chambers (Contributor):

Good idea on the sleep time being a multiple of the patch time duration.

Comment on lines 77 to 88
# Sleep longer
time.sleep(4*num_sec)

# Check whether new data was detected in the spool
sp = dc.spool(data_path).sort("time").update()
len_updated_sp = len(sp)

if len_last_sp == len_updated_sp:
print(f"No new data was detected in spool even after "
"four times of patch time, {4*num_sec} sec. "
"Real-time data processing ended successfully.")
break

d-chambers (Contributor):

Since the example is getting a bit long already, maybe remove this and just change sleep_time_mult to 3? Then update the print statement and just break like it was before.
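
A sketch of the simplified exit described here, reusing names from the quoted code and the spool sketch above (illustrative only, not the final recipe):

```python
import time

while True:
    ...  # process the current sub-spool

    # Wait one multiple of the patch duration, then check the spool once more
    time.sleep(sleep_time)
    sp = spool.update().sort("time")

    # Stop if no new patches arrived since the last pass
    if len(sp) == len_last_sp:
        print(f"No new data was detected in the spool after {sleep_time:.0f} sec. "
              "Real-time data processing ended successfully.")
        break
```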

Comment on lines 57 to 58
initial_run = (i == 0)
run_num = i+1

d-chambers (Contributor):

Can we just set i = 1 and

initial_run = (i==1)

so we can remove line 58?
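
For illustration, the counter arrangement this suggests might look like:

```python
# Start the counter at 1 so it doubles as the run number
i = 1
while True:
    initial_run = (i == 1)
    print(f"\nRun number: {i}")
    ...  # processing for this pass
    i += 1
```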

d-chambers (Contributor):

Also, probably goes without saying, but before merging, it would be good to manually run the code to make sure it works 😄. With the exec: false setting the doc build won't actually verify that for us.

ahmadtourei merged commit 8390f0f into master on Jan 11, 2024 (1 check passed)
ahmadtourei deleted the real_time_proc branch on January 11, 2024 at 20:57

ahmadtourei (Collaborator, Author):

> Also, probably goes without saying, but before merging, it would be good to manually run the code to make sure it works 😄. With the exec: false setting the doc build won't actually verify that for us.

Thank you for the comments! Please feel free to re-open if any further improvement is needed.
