diff --git a/api/data-elements.md b/api/data-elements.md
index aa64334..110fd5b 100644
--- a/api/data-elements.md
+++ b/api/data-elements.md
@@ -3,4 +3,95 @@ uid: data-elements
 title: Data Elements
 ---
 
-Data elements are produced by Bonsai operators.
\ No newline at end of file
+Data elements are produced by Bonsai operators. These pages contain information about data elements that can help you interpret and load the data they produce.
+
+In general, a data element comprises properties which together contain timestamped data from a particular device. For example, a BNO055 data operator outputs [Bno055DataFrames](xref:OpenEphys.Onix1.Bno055DataFrame) which contain data produced by a BNO055 device:
+- The data produced by the BNO055 is contained in the Acceleration, Calibration, EulerAngle, Gravity, and Temperature properties of the Bno055DataFrame. Any of these properties can be individually selected and [visualized](xref:visualize-data) in Bonsai.
+- The Clock property contains the precise hardware timestamp for the data in the properties described in the first bullet point, created using the global ONIX Controller clock. This Clock property can be used to sync BNO055 data with data from all other devices from which ONIX is acquiring and put it all onto the same timeline.
+- The HubClock property contains the precise hardware timestamp created using the clock on the hardware that contains the device.
+
+There are some exceptions to the pattern described above. For example:
+- `ContextTask` is an object passed through the configuration chain for writing to and reading from the ONIX hardware.
+- The output clock data operator outputs the parameters used to set the precise hardware output clock when the workflow starts.
+
+These pages also describe the type of each property. This type information can be used to calculate the rate of data produced by the devices enabled in your experiment. For example, the `NeuropixelsV2eData` operator (which outputs the data from a single Neuropixels 2.0 probe device) produces a sequence of [NeuropixelsV2eDataFrames](xref:OpenEphys.Onix1.NeuropixelsV2eDataFrame). Using the fact that each sample comprises a Clock property (8 bytes), a HubClock property (8 bytes), and an AmplifierData property (384*2 bytes), this device's data rate is:
+
+$$
+\begin{equation}
+  \frac{(2*384+8+8)\,bytes}{sample}*\frac{30,000\,samples}{s}*\frac{1\,MB}{10^6\,bytes} = 23.52\,MB/s
+  \label{eq:1x_npx2_bw}
+\end{equation}
+$$
+
+NeuropixelsV2eDataFrame is actually a buffered data frame (as indicated by the presence of NeuropixelsV2eData's BufferSize property), meaning that several data samples and their timestamps are buffered into a single NeuropixelsV2eDataFrame. The above calculation assumes that NeuropixelsV2eData's BufferSize property is set to 1. Although the calculation is slightly different when BufferSize is greater than 1, the result is the same. When BufferSize is greater than 1, NeuropixelsV2eDataFrames are produced at a rate of 30 kHz divided by the value of BufferSize, and each NeuropixelsV2eDataFrame comprises:
+
+- a Clock property: an array of ulong (each 8 bytes) of length N
+- a HubClock property: an array of ulong (each 8 bytes) of length N
+- an AmplifierData property: a matrix of ushort (each 2 bytes) of size 384 x N
+
+where N is a stand-in for BufferSize. Therefore, the calculation becomes:
+
+$$
+\begin{equation}
+  \frac{(2*384+8+8)*N\,bytes}{frame}*\frac{30,000/N\,frames}{s}*\frac{1\,MB}{10^6\,bytes} = 23.52\,MB/s
+  \label{eq:1x_npx2_bw_buffersize}
+\end{equation}
+$$
+
+N cancels out and the result is the same.
+
+Knowing the type of each property can also be helpful in two more ways:
+
+- A property's type indicates how a property can be used in Bonsai. Operators typically accept only a specific type or set of types as inputs. When types don't match, Bonsai indicates an error.
+- If a property is saved as a raw binary file, knowing its type informs how to load the data. For example, the [dtypes](https://numpy.org/doc/stable/reference/arrays.dtypes.html) in our [example Breakout Board data-loading script](xref:breakout_load-data) were selected according to the size of each data element being saved: digital input clock samples are saved using 8 bytes, which requires `dt=np.uint64` when loading, and digital input pin samples are saved using a single byte, which requires `dt=np.uint8` when loading, as sketched below.
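+
+For instance, here is a minimal sketch of loading such raw binary files with [NumPy](https://numpy.org/). The file names below are hypothetical placeholders for files produced by your own workflow; the dtypes simply follow the sizes described above.
+
+```python
+# Minimal sketch: loading raw binary data with the dtypes described above.
+# The file names are hypothetical placeholders; the dtypes follow the saved
+# sizes (8-byte clock counts, 1-byte digital pin states).
+import numpy as np
+
+clock = np.fromfile("digital-input-clock.raw", dtype=np.uint64)  # 64-bit clock counts
+pins = np.fromfile("digital-input-pins.raw", dtype=np.uint8)     # one byte per sample
+
+print(f"loaded {clock.size} clock samples and {pins.size} pin samples")
+```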
+
diff --git a/articles/getting-started/onix-configuration.md b/articles/getting-started/onix-configuration.md
index fbf61ba..84c02ad 100644
--- a/articles/getting-started/onix-configuration.md
+++ b/articles/getting-started/onix-configuration.md
@@ -54,7 +54,8 @@
 The data acquisition process is started when ContextTask passes through StartAcquisition.
 StartAcquisition allows the user to set parameters that are related to data acquisition such as
 ReadSize and WriteSize. Setting the ReadSize property for a particular workflow is a balancing
 act of minimizing latency of data transfers from the ONIX
-system and avoiding data accumulation in the ONIX system's hardware buffer.
+system and avoiding data accumulation in the ONIX system's hardware buffer. To learn about the
+process of tuning ReadSize, check out the [Optimizing Closed Loop Performance](xref:tune-readsize) tutorial.
 
 ::: workflow
 ![/workflows/getting-started/start-acquisition.bonsai workflow](../../workflows/getting-started/start-acquisition.bonsai)
diff --git a/articles/tutorials/toc.yml b/articles/tutorials/toc.yml
index f91f9f1..977e714 100644
--- a/articles/tutorials/toc.yml
+++ b/articles/tutorials/toc.yml
@@ -2,3 +2,4 @@ items:
   - href: ephys-processing-listening.md
   - href: ephys-socket.md
+  - href: tune-readsize.md
diff --git a/articles/tutorials/tune-readsize.md b/articles/tutorials/tune-readsize.md
new file mode 100644
index 0000000..15a0969
--- /dev/null
+++ b/articles/tutorials/tune-readsize.md
@@ -0,0 +1,437 @@
+---
+uid: tune-readsize
+title: Optimizing Closed Loop Performance
+---
+
+This tutorial shows how to retrieve data from the ONIX hardware as quickly as possible for experiments with strict low-latency closed-loop requirements by tuning the workflow for your particular data sources and computer specifications. In most situations, sub-200 microsecond closed-loop response times can be achieved.
+
+> [!NOTE]
+> Performance will vary based on your computer's capabilities, and your results might differ from those presented below. The computer used to create this tutorial has the following specs:
+>
+> - CPU: Intel i9-12900K
+> - RAM: 64 GB
+> - GPU: NVIDIA GTX 1070 8GB
+> - OS: Windows 11
+
+## Data Transmission from ONIX Hardware to Host Computer
+
+ONIX is capable of transferring data directly from the point of production to the host computer. However, if the host is busy when ONIX starts producing data, ONIX will temporarily store this new data in its hardware buffer while it waits for the host to be ready to accept new data.
+
+Key details about this process:
+
+- The size of hardware-to-host data transfers is determined by the `ReadSize` property of the `StartAcquisition` operator, which is in every Bonsai workflow that uses `OpenEphys.Onix1` to acquire data from ONIX.
+- Increasing `ReadSize` allows the host to read larger chunks of data from ONIX per read operation without significantly increasing the duration of the read operation, therefore increasing the maximum rate at which data can be read.
+- If the host is busy or cannot perform read operations rapidly enough to keep up with the rate at which ONIX produces data, the ONIX hardware buffer will start to accumulate excessive data.
+- Accumulation of excess data in the hardware buffer collapses real-time performance and risks hardware buffer overflow, which would prematurely terminate the acquisition session. `ReadSize` can be increased to avoid this situation.
+- As long as this situation is avoided, decreasing `ReadSize` means that ONIX doesn't need to produce as much data before the host can access it. This, in effect, means software can start operating on data closer to the time that the data was produced, thus achieving lower-latency feedback loops.
+
+In other words, a small `ReadSize` can help the host access data closer to when that data was created. However, each data transfer incurs overhead. If `ReadSize` is so small that ONIX produces a `ReadSize` amount of data faster than the average time it takes the host computer to perform a read operation, the hardware buffer will accumulate excessive data. This will destroy real-time performance and eventually cause the hardware buffer to overflow, terminating acquisition. The goal of this tutorial is to tune StartAcquisition's `ReadSize` so that data flows from production to the software running on the host as quickly as possible by minimizing the amount of time that it sits idly in both the ONIX hardware buffer and the host computer's buffer. This provides software access to the data as close to when the data was produced as possible, which helps achieve lower-latency closed-loop feedback.
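+
+To build intuition for this trade-off, the toy model below sketches how the hardware buffer behaves when the time to fill one `ReadSize` block is shorter or longer than a fixed per-read overhead. The production rate and overhead values are made-up placeholders, not measurements of any particular host computer.
+
+```python
+# Illustrative sketch only: a toy model of hardware-buffer accumulation.
+# The production rate and per-read overhead are made-up placeholders.
+PRODUCTION_RATE = 47e6   # bytes produced per second by the hardware
+READ_OVERHEAD = 25e-6    # assumed fixed cost of each read operation, in seconds
+
+def buffer_growth_per_second(read_size):
+    """Net bytes accumulating in the hardware buffer each second (toy model)."""
+    time_to_fill = read_size / PRODUCTION_RATE          # time to produce one block
+    time_per_read = max(time_to_fill, READ_OVERHEAD)    # a read cannot complete faster than its overhead
+    bytes_read_per_second = read_size / time_per_read
+    return PRODUCTION_RATE - bytes_read_per_second
+
+for read_size in (16384, 2048, 1024, 512):
+    growth = buffer_growth_per_second(read_size)
+    print(f"ReadSize={read_size:>6}: buffer grows by {growth / 1e6:5.2f} MB/s")
+```
+
+With these placeholder numbers, large blocks keep up easily, while very small blocks cause the buffer to grow without bound -- the same qualitative behavior explored with real hardware later in this tutorial.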
+
+### Technical Details
+
+> [!NOTE]
+> This section explains in more depth how data is transferred from ONIX to the host computer. Although these details provide additional context about ONIX, they are more technical and are not required for following the rest of the tutorial.
+
+When the host computer reads data from the ONIX hardware, it retrieves a **ReadSize**-byte chunk of data using the following procedure:
+
+1. A `ReadSize`-byte block of memory is allocated in the host computer's RAM by the host API for the purpose of holding incoming data from ONIX.
+1. A pointer to that memory is provided to the [RIFFA](https://open-ephys.github.io/ONI/v1.0/api/liboni/driver-translators/riffa.html) driver (the PCIe backend/kernel driver for the ONIX system), which moves the allocated memory block into a more privileged state known as kernel mode so that it can initiate a [DMA transfer](https://en.wikipedia.org/wiki/Direct_memory_access). DMA allows data transfer to be performed by ONIX hardware without additional CPU intervention.
+1. The data transfer completes once this block of data has been populated with `ReadSize` bytes of data from ONIX.
+1. The RIFFA driver moves the memory block from kernel mode to user mode so that it can be accessed by software. The API function returns with a pointer to the filled buffer.
+
+During this process, memory is allocated only once by the API, and the transfer is [zero-copy](https://en.wikipedia.org/wiki/Zero-copy). The API-allocated buffer is written autonomously by ONIX hardware using minimal resources from the host computer.
+
+So far, all this occurs on the host side. Meanwhile, on the ONIX side:
+
+- If ONIX produces new data before the host is able to consume the data in the API-allocated buffer, this new data is added to the back of the ONIX hardware buffer FIFO. The ONIX hardware buffer consists of 2 GB of RAM that belongs to the acquisition hardware (it is _not_ RAM in the host computer) dedicated to temporarily storing data that is waiting to be transferred to the host. Data is removed from the front of the hardware buffer and transferred to the host once the host is ready to accept more data.
+- If the memory is allocated on the host side and the data transfer is initiated by the host API before any data is produced, ONIX transfers new data directly to the host, bypassing the hardware buffer. In this case, ONIX is literally streaming data to the host _the moment it is produced_. This data becomes available for reading by the host once ONIX transfers the full `ReadSize` bytes.
+
+## Tuning `ReadSize` to Optimize Closed Loop Performance
+
+ONIX provides a mechanism for tuning the value of `ReadSize` that takes into account the idiosyncrasies of your host computer and experimental acquisition setup in order to optimize closed-loop performance.
+
+> [!NOTE]
+> If you are not familiar with the basic usage of the `OpenEphys.Onix1` library, then visit the [Getting Started](xref:getting-started) guide to set up your Bonsai environment and familiarize yourself with using the library to acquire data from ONIX before proceeding.
+
+Copy the following workflow by hovering over the workflow image and clicking the clipboard icon that appears. Then open Bonsai and paste it by clicking the Bonsai workflow editor pane and pressing Ctrl+V.
+
+::: workflow
+![SVG of load tester workflow](../../workflows/tutorials/tune-readsize/tune-readsize.bonsai)
+:::
+
+### Hardware Configuration
+
+The top-row configuration chain includes a `ConfigureLoadTester` operator. This configures ONIX's Load Tester device, which produces and consumes data at user-specified rates for testing and tuning the latency between data production and real-time feedback. This device is _not an emulator_. It is a real hardware device that produces and consumes data using the selected driver and physical link (e.g. the PCIe bus) and thus provides accurate measurements of feedback performance for a given host computer.
+
+::: workflow
+![SVG of load tester workflow configuration chain](../../workflows/tutorials/tune-readsize/configuration.bonsai)
+:::
+
+We need to configure the load tester to produce and consume the same amount of data as our real experimental hardware would. For example, let's say that during our closed-loop experiment, feedback signals will be generated as a function of data acquired from two Neuropixels 2.0 probes, each of which generates a 384-channel sample at 30 kHz. The overall bandwidth is
+
+$$
+\begin{equation}
+  2\,probes*\frac{30,000\,frames}{probe \cdot s}*\frac{(2*384+8+8)\,bytes}{frame}*\frac{1\,MB}{10^6\,bytes} \approx 47\,MB/s
+  \label{eq:2xnpx2bw}
+\end{equation}
+$$
+
+To understand how we came up with this calculation, visit the [Data Elements](xref:data-elements) page.
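+
+As a quick sanity check, the same arithmetic can be written out in a few lines of Python. This simply restates the equation above: two probes, each producing 384 two-byte samples plus two 8-byte clock values per frame at 30 kHz.
+
+```python
+# Restating the two-probe Neuropixels 2.0 bandwidth calculation above.
+PROBES = 2
+CHANNELS = 384
+BYTES_PER_SAMPLE = 2
+CLOCK_BYTES = 8 + 8            # Clock + HubClock per frame
+FRAMES_PER_SECOND = 30_000     # per probe
+
+bytes_per_frame = CHANNELS * BYTES_PER_SAMPLE + CLOCK_BYTES   # 784 bytes
+bandwidth = PROBES * bytes_per_frame * FRAMES_PER_SECOND      # bytes per second
+print(f"{bandwidth / 1e6:.2f} MB/s")                          # ~47 MB/s
+```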
+
+We'll set up `ConfigureLoadTester` to produce data at the same frequency and bandwidth as two Neuropixels 2.0 probes with the following settings:
+
+![screenshot of ConfigureLoadTester's property editor](../../images/tutorials/tune-readsize/load-tester-configuration_properties-editor.webp)
+
+- `DeviceAddress` is set to 11 because that's how this device is indexed in the ONIX system.
+- `DeviceName` is set to "Load Tester".
+- `Enable` is set to True to enable the Load Tester device.
+- `FramesPerSecond` is set to 60,000 Hz, the rate at which frames are produced by the two probes combined, since each probe is acquired independently at 30 kHz.
+- `ReceivedWords` is set to 392, which corresponds to the size of a single Neuropixels 2.0 sample including its clock members.
+- `TransmittedWords` is set to 100. This simulates the amount of data required to, e.g., send a stimulus waveform.
+
+> [!NOTE]
+> The `DeviceAddress` must be manually configured because the Load Tester device is used for diagnostics and testing and therefore is not made available through `ConfigureBreakoutBoard` like the rest of the local devices (analog IO, digital IO, etc.). The device address can be found using [oni-repl](https://open-ephys.github.io/onix-docs/Software%20Guide/oni-repl/usage.html#repl-commands).
+
+Next we configure `StartAcquisition`'s `ReadSize` and `WriteSize` properties.
+
+`WriteSize` is set to 16384 bytes. This defines a readily available pool of memory for the creation of output data frames. Data is written to hardware as soon as an output frame has been created, so the effect on real-time performance is typically not as large as that of the `ReadSize` property.
+
+To start, `ReadSize` is also set to 16384. Later in this tutorial, we'll examine the effect of this value on real-time performance.
+
+### Real-time Loop
+
+The bottom half of the workflow is used to stream data back to the load testing device from hardware so that it can perform a measurement of round-trip latency. The `LoadTesterData` operator acquires a sequence of [LoadTesterDataFrames](xref:OpenEphys.Onix1.LoadTesterDataFrame) from the hardware, each of which is split into its `HubClock` member and `HubClockDelta` member.
+
+::: workflow
+![SVG of load tester workflow loadtester branch](../../workflows/tutorials/tune-readsize/loadtester.bonsai)
+:::
+
+The `HubClock` member indicates the acquisition clock count when the `LoadTesterDataFrame` was produced. The `EveryNth` operator only allows through every Nth element in the observable sequence. This is used to simulate an algorithm, such as spike detection, that only triggers closed-loop feedback in response to input data meeting some condition. The value of `N` can be changed to simulate different feedback frequencies. You can inspect its logic by double-clicking the node when the workflow is not running. In this case, `N` is set to 100, so every 100th sample is delivered to `LoadTesterLoopback`.
+
+`LoadTesterLoopback` is a _sink_ which writes the HubClock values it receives back to the load tester device. When the load tester device receives a HubClock from the host computer, it subtracts that value from the current acquisition clock count. That difference is sent back to the host computer as the `HubClockDelta` property of subsequent `LoadTesterDataFrames`. In other words, `HubClockDelta` indicates the amount of time that passed between the creation of a frame in hardware and the receipt in hardware of a feedback signal based on that frame: it is a complete measurement of closed-loop latency. This value is converted to milliseconds and then used by `Histogram1D` to help visualize the distribution of closed-loop latencies.
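+
+If you log `HubClockDelta` values and want to inspect them outside of Bonsai, the conversion is a single division by the acquisition clock rate. The sketch below assumes the 250 MHz clock implied by the workflow's `ToMilliseconds` expression (`it/250000.0`) and uses made-up delta values; adjust the rate if your hardware's acquisition clock differs.
+
+```python
+# Sketch: convert HubClockDelta counts into latency statistics.
+# Assumes a 250 MHz acquisition clock (matching the ToMilliseconds
+# expression it/250000.0 in this workflow); the delta values are made up.
+import numpy as np
+
+ACQ_CLOCK_HZ = 250e6
+hub_clock_delta = np.array([20_000, 22_500, 75_000, 19_000], dtype=np.uint64)
+
+latency_us = hub_clock_delta / ACQ_CLOCK_HZ * 1e6
+print(f"mean latency: {latency_us.mean():.1f} us")
+print(f"95th percentile: {np.percentile(latency_us, 95):.1f} us")
+```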
+
+Finally, at the bottom of the workflow, a `MemoryMonitorData` operator is used to examine the state of the hardware buffer. To learn about the `MemoryMonitorData` branch, visit the [Breakout Board Memory Monitor](xref:breakout_memory-monitor) page.
+
+::: workflow
+![SVG of load tester workflow memorymonitor branch](../../workflows/tutorials/tune-readsize/memory-monitor.bonsai)
+:::
+
+### Relevant Visualizers
+
+The desired outputs of this workflow are the [visualizers](xref:visualize-data) for the Histogram1D and PercentUsed nodes. Below is an example of each, which we will explore further in the next section:
+
+![screenshot of Histogram1D visualizers with `ReadSize` 16384](../../images/tutorials/tune-readsize/histogram1d_16384.webp)
+![screenshot of PercentUsed visualizers with `ReadSize` 16384](../../images/tutorials/tune-readsize/percent-used_16384.webp)
+
+The Histogram1D visualizer shows the distribution of closed-loop feedback latencies. The x-axis is in units of μs, and the y-axis represents the number of samples in a particular bin. The histogram is configured to have 1000 bins between 0 and 1000 μs. For low-latency closed-loop experiments, the goal is to concentrate the distribution of closed-loop feedback latencies towards 0 μs as much as possible.
+
+The PercentUsed visualizer shows a time series of the amount of the hardware buffer that is occupied by data as a percentage of the hardware buffer's total capacity. The x-axis is timestamps, and the y-axis is percentage. To ensure data is available as soon as possible after it was produced and to avoid potential buffer overflow, the goal is to maintain the percentage at or near zero.
+
+### Real-time Latency for Different `ReadSize` Values
+
+#### `ReadSize` = 16384 bytes
+
+With `ReadSize` set to 16384 bytes, start the workflow and open the visualizers for the PercentUsed and Histogram1D nodes:
+
+![screenshot of Histogram1D visualizers with `ReadSize` 16384](../../images/tutorials/tune-readsize/histogram1d_16384.webp)
+![screenshot of PercentUsed visualizers with `ReadSize` 16384](../../images/tutorials/tune-readsize/percent-used_16384.webp)
+
+The Histogram1D visualizer shows that the average latency is about 300 μs, with most latencies ranging from ~60 μs to ~400 μs. This roughly matches our expectations. Since data is produced at about 47 MB/s, it takes roughly 350 μs to produce 16384 bytes of data. This means that the data contained in a single `ReadSize` block was generated over a span of approximately 350 μs. Because we are using every 100th sample to generate feedback, the sample that actually triggers LoadTesterLoopback could be any sample from that span, resulting in a range of latencies. The long tail in the distribution corresponds to instances when the hardware buffer was used or the CPU was busy with other tasks.
+
+The PercentUsed visualizer shows that the percentage of the hardware buffer being used remains close to zero. This indicates minimal usage of the hardware buffer, and that the host is safely reading data faster than ONIX produces that data. For experiments without hard real-time constraints, this latency is perfectly acceptable.
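+
+The lower bound on these latencies scales with how long it takes to fill one `ReadSize` block. A quick back-of-the-envelope calculation, assuming the ~47 MB/s production rate used in this tutorial, is sketched below; actual latencies also include transfer, software, and write-back overheads.
+
+```python
+# Time needed to produce one ReadSize block at ~47 MB/s (two Neuropixels 2.0 probes).
+PRODUCTION_RATE = 47.04e6   # bytes per second
+
+for read_size in (16384, 2048, 1024):
+    fill_time_us = read_size / PRODUCTION_RATE * 1e6
+    print(f"ReadSize={read_size:>6} bytes -> one block produced every {fill_time_us:6.1f} us")
+```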
+
+For experiments with harder real-time constraints, let's see how much lower we can get the closed-loop latency.
+
+#### `ReadSize` = 2048 bytes
+
+Set `ReadSize` to 2048 bytes, restart the workflow (`ReadSize` is a [configuration](xref:OpenEphys.Onix1#configuration) property, so it only updates when a workflow starts), and open the same visualizers:
+
+![screenshot of Histogram1D visualizers with `ReadSize` 2048](../../images/tutorials/tune-readsize/histogram1d_2048.webp)
+![screenshot of PercentUsed visualizers with `ReadSize` 2048](../../images/tutorials/tune-readsize/percent-used_2048.webp)
+
+The Histogram1D visualizer shows that closed-loop latencies now average about 80 μs with lower variability.
+
+The PercentUsed visualizer shows the hardware buffer is still stable at around zero. This means that, even with the increased overhead associated with a smaller `ReadSize`, the host is reading data rapidly enough to prevent excessive accumulation in the hardware buffer. Let's see if we can decrease latency even further.
+
+#### `ReadSize` = 1024 bytes
+
+Set `ReadSize` to 1024 bytes, restart the workflow, and open the same visualizers.
+
+![screenshot of Histogram1D visualizers with `ReadSize` 1024](../../images/tutorials/tune-readsize/histogram1d_1024.webp)
+![screenshot of PercentUsed visualizers with `ReadSize` 1024](../../images/tutorials/tune-readsize/percent-used_1024.webp)
+
+The Histogram1D visualizer appears to be empty. This is because the latency immediately exceeds the x-axis upper limit of 1 ms. You can see this by inspecting the visualizer for the node prior to Histogram1D. Because of the very small buffer size (which is on the order of a single Neuropixels 2.0 sample), the computer cannot perform read operations at the rate required to keep up with data production. This causes excessive accumulation of data in the hardware buffer. In this case, when new data is produced, it gets added to the end of the hardware buffer queue, requiring several read operations before this new data can be read. As more data accumulates in the buffer, the time between when data is produced and when it can finally be read increases. In other words, latencies increase dramatically, and closed-loop performance collapses.
+
+The PercentUsed visualizer shows that the percentage of the hardware buffer that is occupied is steadily increasing. The acquisition session will eventually terminate in an error when the MemoryMonitor PercentUsed reaches 100% and the hardware buffer overflows.
+
+#### Summary
+
+The results of our experimentation are as follows:
+
+| `ReadSize`  | Latency        | Buffer Usage   | Notes                                                                                              |
+| ----------- | -------------- | -------------- | -------------------------------------------------------------------------------------------------- |
+| 16384 bytes | ~300 μs        | Stable at 0%   | Perfectly adequate if there are no strict low-latency requirements; lowest risk of buffer overflow  |
+| 2048 bytes  | ~80 μs         | Stable near 0% | Balances latency requirements with low risk of buffer overflow                                      |
+| 1024 bytes  | Rises steadily | Unstable       | Certain buffer overflow and terrible closed-loop performance                                        |
+
+These results may differ for your experimental system. For example, your system might have different bandwidth requirements (if you are using different devices, data is produced at a different rate) or use a computer with different performance capabilities (which changes how quickly read operations can occur).
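+
+For instance, the data rate of a single 64-channel Intan chip can be estimated the same way as the Neuropixels 2.0 rate above. The sketch below counts only the 64 amplifier channels plus the two 8-byte clock values per frame and ignores any auxiliary channels, so treat it as an approximation.
+
+```python
+# Rough estimate of the data rate for one 64-channel Intan chip sampled at 30 kHz.
+# Only amplifier channels and the two clock values are counted; auxiliary
+# channels are ignored, so this is an approximation.
+CHANNELS = 64
+BYTES_PER_SAMPLE = 2
+CLOCK_BYTES = 8 + 8
+FRAMES_PER_SECOND = 30_000
+
+rate = (CHANNELS * BYTES_PER_SAMPLE + CLOCK_BYTES) * FRAMES_PER_SECOND
+print(f"{rate / 1e6:.2f} MB/s")   # ~4.3 MB/s
+```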
+
+Here is a similar table made by configuring the Load Tester device to produce data at a rate similar to a single 64-channel Intan chip (such as those found on several ONIX headstages), ~4.3 MB/s:
+
+![screenshot of ConfigureLoadTester's property editor for a single Intan chip](../../images/tutorials/tune-readsize/load-tester-configuration_properties-editor_64ch.webp)
+
+| `ReadSize` | Latency | Buffer Usage | Notes                                                                               |
+| ---------- | ------- | ------------ | ------------------------------------------------------------------------------------ |
+| 1024 bytes | ~200 μs | Stable at 0% | Perfectly adequate if there are no strict low-latency requirements                    |
+| 512 bytes  | ~110 μs | Stable at 0% | Lower latency, no risk of buffer overflow                                             |
+| 256 bytes  | ~80 μs  | Stable at 0% | Lowest achievable latency with this setup, still no risk of buffer overflow           |
+| 128 bytes  | -       | -            | Results in an error -- 128 bytes is too small for the current hardware configuration  |
+
+Regarding the last row of the above table, the lowest possible `ReadSize` is determined by the size of the largest data frame produced by the enabled devices (plus some overhead). Even with the lowest possible `ReadSize` value, 256 bytes, there is very little risk of overflowing the buffer. The PercentUsed visualizer shows that the hardware buffer does not accumulate data:
+
+![](../../images/tutorials/tune-readsize/percent-used_256_lower-payload.png)
+
+> [!TIP]
+> - The only constraint on `ReadSize` is the lower limit, as demonstrated in the example of tuning `ReadSize` for a single 64-channel Intan chip. We only tested `ReadSize` values that are a power of 2, but `ReadSize` can be fine-tuned further to achieve even tighter latencies if necessary.
+> - **As of OpenEphys.Onix1 0.7.0:** As long as you stay above the minimum mentioned in the previous bullet point, `ReadSize` can be set to any value by the user. The OpenEphys.Onix1 Bonsai package will round this `ReadSize` to the nearest multiple of four and use that value instead. For example, if you try to set `ReadSize` to 887, the software will use the value 888 instead.
+> - If you are using a data I/O operator that can produce data at various rates (like `DigitalInput`), test your chosen `ReadSize` by configuring the load tester to produce data at the lower and upper limits at which you expect data to be produced during your experiment. This will help ensure excess data doesn't accumulate in the hardware buffer and desired closed-loop latencies are maintained throughout the range of data throughput of these devices.
+> - Running other processes that demand the CPU's attention might cause spurious spikes in data accumulation in the hardware buffer. Either reduce the number of other processes running or test that they don't interfere with your experiment.
+
+These two tables together demonstrate why it is impossible to recommend a single correct value for `ReadSize` that is adequate for all experiments. The diversity of experiments (in particular, the wide range of rates at which they produce data) requires a range of `ReadSize` values.
+
+Last, the workflow used in this tutorial imposed minimal computational load. In most applications, some processing is performed on the data to generate the feedback signal. It's important to take this into account when tuning your system, potentially by modifying the workflow to perform similar computations on incoming data so that the effect of computational demand on closed-loop performance is captured.
+
+### Measuring Latency in an Actual Experiment
+
+After tuning `ReadSize`, it is important to experimentally verify the latencies using the actual devices in your experiment. For example, if your feedback involves toggling ONIX's digital output (which in turn toggles a stimulation device like a [Stimjim](https://github.com/open-ephys/stimjim) or an [RHS2116 external trigger](xref:OpenEphys.Onix1.ConfigureRhs2116Trigger.TriggerSource)), you can loop that digital output signal back into one of ONIX's digital inputs to measure when the feedback physically occurs. Feedback latency is then the difference between the clock count when the trigger condition occurs and the clock count when the looped-back feedback signal is received by ONIX.
+
+You might wonder why you'd even use the Load Tester device if you can measure latency using the actual devices that you intend to use in your experiment. The benefit of the Load Tester device is that you're able to collect at least tens of thousands of latency samples to plot in a histogram in a short amount of time. Trying to use digital I/O to take as many latency measurements in a similar amount of time can render your latency measurements inaccurate for the actual experiment you intend to perform. In particular, toggling digital inputs faster necessarily increases the total data throughput of `DigitalInput`. If the data throughput of `DigitalInput` significantly exceeds what is required for your experiment, the latency measurements will not reflect the latencies you will experience during the actual experiment.
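+
+As a sketch of that calculation, with hypothetical clock counts standing in for values you would extract from your own recording and assuming the 250 MHz acquisition clock used elsewhere in this tutorial:
+
+```python
+# Sketch: feedback latency from two recorded acquisition clock counts.
+# trigger_clock and feedback_clock are hypothetical placeholders for values
+# extracted from your own recording; 250 MHz is the assumed acquisition clock.
+ACQ_CLOCK_HZ = 250e6
+
+trigger_clock = 1_000_000_000     # clock count when the trigger condition occurred
+feedback_clock = 1_000_037_500    # clock count when the looped-back signal arrived
+
+latency_us = (feedback_clock - trigger_clock) / ACQ_CLOCK_HZ * 1e6
+print(f"feedback latency: {latency_us:.1f} us")   # 150.0 us for these placeholder values
+```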
+ + \ No newline at end of file diff --git a/images/tutorials/tune-readsize/histogram1d_1024.webp b/images/tutorials/tune-readsize/histogram1d_1024.webp new file mode 100644 index 0000000..bb975f2 Binary files /dev/null and b/images/tutorials/tune-readsize/histogram1d_1024.webp differ diff --git a/images/tutorials/tune-readsize/histogram1d_16384.webp b/images/tutorials/tune-readsize/histogram1d_16384.webp new file mode 100644 index 0000000..259e3e7 Binary files /dev/null and b/images/tutorials/tune-readsize/histogram1d_16384.webp differ diff --git a/images/tutorials/tune-readsize/histogram1d_2048.webp b/images/tutorials/tune-readsize/histogram1d_2048.webp new file mode 100644 index 0000000..93db118 Binary files /dev/null and b/images/tutorials/tune-readsize/histogram1d_2048.webp differ diff --git a/images/tutorials/tune-readsize/load-tester-configuration_properties-editor.webp b/images/tutorials/tune-readsize/load-tester-configuration_properties-editor.webp new file mode 100644 index 0000000..5399714 Binary files /dev/null and b/images/tutorials/tune-readsize/load-tester-configuration_properties-editor.webp differ diff --git a/images/tutorials/tune-readsize/load-tester-configuration_properties-editor_64ch.webp b/images/tutorials/tune-readsize/load-tester-configuration_properties-editor_64ch.webp new file mode 100644 index 0000000..c1269a0 Binary files /dev/null and b/images/tutorials/tune-readsize/load-tester-configuration_properties-editor_64ch.webp differ diff --git a/images/tutorials/tune-readsize/percent-used_1024.webp b/images/tutorials/tune-readsize/percent-used_1024.webp new file mode 100644 index 0000000..f70338c Binary files /dev/null and b/images/tutorials/tune-readsize/percent-used_1024.webp differ diff --git a/images/tutorials/tune-readsize/percent-used_16384.webp b/images/tutorials/tune-readsize/percent-used_16384.webp new file mode 100644 index 0000000..5effe5e Binary files /dev/null and b/images/tutorials/tune-readsize/percent-used_16384.webp differ diff --git a/images/tutorials/tune-readsize/percent-used_2048.webp b/images/tutorials/tune-readsize/percent-used_2048.webp new file mode 100644 index 0000000..6231002 Binary files /dev/null and b/images/tutorials/tune-readsize/percent-used_2048.webp differ diff --git a/images/tutorials/tune-readsize/percent-used_256_lower-payload.png b/images/tutorials/tune-readsize/percent-used_256_lower-payload.png new file mode 100644 index 0000000..80c5080 Binary files /dev/null and b/images/tutorials/tune-readsize/percent-used_256_lower-payload.png differ diff --git a/src/bonsai-onix1 b/src/bonsai-onix1 index b9bdc3c..6dbe5b8 160000 --- a/src/bonsai-onix1 +++ b/src/bonsai-onix1 @@ -1 +1 @@ -Subproject commit b9bdc3c0bb340843ff4288b6145308176307d81e +Subproject commit 6dbe5b83875fcb14d7e187d03aaf9c9e7da146df diff --git a/template/partials/hardware/configuration.tmpl.partial b/template/partials/hardware/configuration.tmpl.partial index dadc33d..ba173a2 100644 --- a/template/partials/hardware/configuration.tmpl.partial +++ b/template/partials/hardware/configuration.tmpl.partial @@ -61,8 +61,10 @@ {{{blockReadSize}}} bytes, meaning data collection will wait until {{{blockReadSize}}} bytes of data have been produced by the hardware. At {{{dataRate}}} MB/s the hardware will produce {{{blockReadSize}}} bytes every ~{{{timeUntilFullBuffer}}}. This is a hard bound on the latency of - the system. If lower latencies were required, the hardware would need to produce data more quickly - or the ReadSize property value would need to be reduced. 
+ the system. If lower latencies are required, the hardware would need to produce data more quickly + or the ReadSize property value would need to be reduced. To learn about the process of tuning ReadSize, + check out the + Tune ReadSize tutorial.

diff --git a/workflows/operators/ConfigureLoadTester.bonsai b/workflows/operators/ConfigureLoadTester.bonsai index fc4b0bc..eb26764 100644 --- a/workflows/operators/ConfigureLoadTester.bonsai +++ b/workflows/operators/ConfigureLoadTester.bonsai @@ -1,5 +1,5 @@  - @@ -13,12 +13,11 @@ - LoadTester 0 - true + false 0 0 - 1000 + 0 diff --git a/workflows/operators/LoadTesterData.bonsai b/workflows/operators/LoadTesterData.bonsai new file mode 100644 index 0000000..b11c656 --- /dev/null +++ b/workflows/operators/LoadTesterData.bonsai @@ -0,0 +1,29 @@ + + + + + + + Load Tester + + + + HubClock + + + HubClockDelta + + + Counter + + + + + + + + + \ No newline at end of file diff --git a/workflows/operators/LoadTesterLoopback.bonsai b/workflows/operators/LoadTesterLoopback.bonsai new file mode 100644 index 0000000..86fe563 --- /dev/null +++ b/workflows/operators/LoadTesterLoopback.bonsai @@ -0,0 +1,31 @@ + + + + + + + Load Tester + + + + HubClock + + + + Load Tester + + + + HubClockDelta + + + + + + + + + \ No newline at end of file diff --git a/workflows/tutorials/tune-readsize/configuration.bonsai b/workflows/tutorials/tune-readsize/configuration.bonsai new file mode 100644 index 0000000..551b168 --- /dev/null +++ b/workflows/tutorials/tune-readsize/configuration.bonsai @@ -0,0 +1,103 @@ + + + + + + + riffa + 0 + + + + + Load Tester + 11 + true + 392 + 100 + 60000 + + + + + BreakoutBoard + + BreakoutBoard/PersistentHeartbeat + 0 + 100 + + + BreakoutBoard/AnalogIO + 6 + false + TenVolts + TenVolts + TenVolts + TenVolts + TenVolts + TenVolts + TenVolts + TenVolts + TenVolts + TenVolts + TenVolts + TenVolts + Input + Input + Input + Input + Input + Input + Input + Input + Input + Input + Input + Input + + + BreakoutBoard/DigitalIO + 7 + false + 0 + + + + BreakoutBoard/OutputClock + 5 + false + 1000000 + 50 + 0 + + + BreakoutBoard/HarpSyncInput + 12 + false + Breakout + + + BreakoutBoard/MemoryMonitor + 10 + true + 100 + + + + + + 16384 + 16384 + + + + + + + + + + \ No newline at end of file diff --git a/workflows/tutorials/tune-readsize/loadtester.bonsai b/workflows/tutorials/tune-readsize/loadtester.bonsai new file mode 100644 index 0000000..f2b6825 --- /dev/null +++ b/workflows/tutorials/tune-readsize/loadtester.bonsai @@ -0,0 +1,92 @@ + + + + + + + Load Tester + + + + HubClock + + + Every Nth + + + + Source1 + + + + + + Index + + + + + + + 100 + + + + + 0 + + + + + + + + + + + + + + + + + Load Tester + + + + HubClockDelta + + + + + + ToMilliseconds + it/250000.0 + + + + 0 + 1 + 1000 + false + true + + + + + + + + + + + + + + \ No newline at end of file diff --git a/workflows/tutorials/tune-readsize/memory-monitor.bonsai b/workflows/tutorials/tune-readsize/memory-monitor.bonsai new file mode 100644 index 0000000..7e2cccd --- /dev/null +++ b/workflows/tutorials/tune-readsize/memory-monitor.bonsai @@ -0,0 +1,21 @@ + + + + + + + BreakoutBoard/MemoryMonitor + + + + PercentUsed + + + + + + + \ No newline at end of file diff --git a/workflows/tutorials/tune-readsize/tune-readsize.bonsai b/workflows/tutorials/tune-readsize/tune-readsize.bonsai new file mode 100644 index 0000000..01fb85b --- /dev/null +++ b/workflows/tutorials/tune-readsize/tune-readsize.bonsai @@ -0,0 +1,192 @@ + + + + + + + riffa + 0 + + + + + Load Tester + 11 + true + 392 + 100 + 60000 + + + + + BreakoutBoard + + BreakoutBoard/PersistentHeartbeat + 0 + 100 + + + BreakoutBoard/AnalogIO + 6 + false + TenVolts + TenVolts + TenVolts + TenVolts + TenVolts + TenVolts + TenVolts + TenVolts + TenVolts + TenVolts + TenVolts + TenVolts + Input + 
Input + Input + Input + Input + Input + Input + Input + Input + Input + Input + Input + + + BreakoutBoard/DigitalIO + 7 + false + 0 + + + + BreakoutBoard/OutputClock + 5 + false + 1000000 + 50 + 0 + + + BreakoutBoard/HarpSyncInput + 12 + false + Breakout + + + BreakoutBoard/MemoryMonitor + 10 + true + 100 + + + + + + 16384 + 16384 + + + + + Load Tester + + + + HubClock + + + Every Nth + + + + Source1 + + + + + + Index + + + + + + + 100 + + + + + 0 + + + + + + + + + + + + + + + + + Load Tester + + + + HubClockDelta + + + + + + ToMilliseconds + it/250000.0 + + + + 0 + 1 + 1000 + false + true + + + + + BreakoutBoard/MemoryMonitor + + + + PercentUsed + + + + + + + + + + + + + + + + + \ No newline at end of file