# Project 1B: Analysis: Statistics
---
> - Evelyn Johns
> - Ava Pontidas
> - Charlotte Betts
---

In many of our GUIs the calculation and display of various statistical properties will be useful. In this group project you will look at a few standard statistics, and also suggest your own that you think could be useful.

We will consider statistics in the following ways:
- Individual signal; for example the sum of an individual signal, mean, etc
- Individual signal over time; for example the cumulative mean, a rolling (moving-window) average, and RMS
- Selection of signals; for example sum / mean of a selection of BLMs
- Selection of signals over time; for example a rolling average of the sum of all BLMs

Each of these statistics should also be computable over a selected time interval.
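
As a starting point, the single-signal and selection-of-signals cases above might be sketched as follows. This is only an illustration under stated assumptions: a signal is taken to be a plain sequence of floats, the function names (`signal_sum`, `signal_mean`, `signal_rms`, `selection_sum`) are hypothetical, and "sum of a selection" is interpreted here as a sample-by-sample sum across signals. It uses only the standard library; an array library such as NumPy may well be faster for large signals and is worth benchmarking.

```python
import math
from typing import List, Sequence


def signal_sum(signal: Sequence[float]) -> float:
    """Sum of one signal; math.fsum limits floating-point rounding error."""
    return math.fsum(signal)


def signal_mean(signal: Sequence[float]) -> float:
    """Arithmetic mean of one signal."""
    if len(signal) == 0:
        raise ValueError("cannot take the mean of an empty signal")
    return math.fsum(signal) / len(signal)


def signal_rms(signal: Sequence[float]) -> float:
    """Root mean square of one signal."""
    if len(signal) == 0:
        raise ValueError("cannot take the RMS of an empty signal")
    return math.sqrt(math.fsum(x * x for x in signal) / len(signal))


def selection_sum(signals: Sequence[Sequence[float]]) -> List[float]:
    """Sample-by-sample sum across a selection of equal-length signals."""
    if len({len(s) for s in signals}) > 1:
        raise ValueError("all signals in the selection must have the same length")
    return [math.fsum(samples) for samples in zip(*signals)]


# Made-up example data standing in for two BLM signals (assumption).
blm_a = [1.0, 2.0, 3.0, 4.0]
blm_b = [0.5, 0.5, 0.5, 0.5]

print(signal_sum(blm_a))              # 10.0
print(signal_mean(blm_a))             # 2.5
print(selection_sum([blm_a, blm_b]))  # [1.5, 2.5, 3.5, 4.5]
```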

As this data is received currently at 1 Hz (maybe in the future at 10 Hz or more), it is important to define **fast** and **accurate** methods to analyse our data!
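
One possible way to keep a rolling average cheap as samples arrive is to hold the last N values in a `collections.deque` and maintain a running total, so each update costs the same regardless of window size. The class name `RollingMean` and its interface are hypothetical, and the sketch assumes one float per update. At 1 Hz, a window of 5 samples would correspond to the last 5 seconds of data.

```python
from collections import deque


class RollingMean:
    """Mean over the most recent `window` samples, updated in constant time."""

    def __init__(self, window: int) -> None:
        if window < 1:
            raise ValueError("window must be at least 1")
        self._values = deque(maxlen=window)
        self._total = 0.0

    def update(self, value: float) -> float:
        """Add one new sample and return the current rolling mean."""
        if len(self._values) == self._values.maxlen:
            # The oldest sample is about to be dropped, so remove it from the total.
            self._total -= self._values[0]
        self._values.append(value)
        self._total += value
        return self._total / len(self._values)


# Made-up stream of samples (assumption), with a window of 3.
rolling = RollingMean(window=3)
means = [rolling.update(sample) for sample in [1.0, 2.0, 3.0, 4.0, 5.0]]
print(means)  # [1.0, 1.5, 2.0, 3.0, 4.0]
```

The running total trades a little accuracy for speed: over a very long stream, repeated addition and subtraction can accumulate floating-point error, so recomputing the total from the stored window from time to time may be worth testing.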

<div class="alert alert-block alert-info">
<b>General Advice</b>

> In your groups:
> - Start by planning your work
> - You will need fundamental functions; try to plan what these should do, what arguments they should take, which should have default values (some default values will need research - ask us!), and what they should return. Don't forget to consider speed and the format of the data (float? int? array of floats? should the array be a certain length?...)
> - Ask for feedback on your plan to save time
> - Document your functions so others can understand / use / modify them
> - Build more complex functions using the fundamental functions; this may require you to go back and modify the fundamental functions.
> - Some things are suitable for objects (classes), others are not. Speed and accuracy are important.


> When coding a function (unless you have a preferred method):
> 1. Plan your code in bullet points / pseudo-code
> 2. Template your functions - put comments to say what you want to do
> 3. Implement your code step-by-step. Pair programming here is your friend and will catch many mistakes, as well as give 2 people enough experience of the code to help debug it
> 4. Test your code and document the testing
> 5. Provide standard tests <b>only if requested by Haroon/Esher</b> (define an input with a known output and assert that it is the same, or within a tolerance)    
> 6. Provide examples of use for each use case (different behaviour with optional arguments etc)
> 7. Turn your functions into objects if suitable
> 8. Get feedback from peers / supervisors / team-mates / users

> - <b> Make a repository to hold your group project</b> (public, so we can all access it when needed). Try to keep it well organised and easy to navigate, use, and search
> - Make a Jupyter notebook for each investigation / function / method, so that future users (perhaps you) can use these notebooks to construct the code required to make future GUIs work
> - Feel free to store individual work on your own 'dump' repository, and copy completed work to your group repository


</div>

---
## Tasks:

- [ ] Function to sum the data in a single signal
- [ ] Function to calculate mean of a single signal
- [ ] Time selection for above functions (e.g. sum of signal between 3 - 7 ms)
- [ ] Split the signal into N sections, given an input array of time boundaries, for example:
    - input array = [-0.5, 0.0] gives the same output as previous function
    - input array = [3.0, 4.6, 5.5] gives output stats between 3.0 - 4.6 ms, and 4.6 - 5.5 ms
    - input array = [3.0, 4.6, 2.2] raises an exception because the last value goes backwards in time (built-in exception types are listed here: https://docs.python.org/3/library/exceptions.html, and examples can be found on the web)
    - input array = [-0.5, 0.0, 3.5, 9.5, 10.5] gives 4 outputs, for the intervals (-0.5 - 0.0), (0.0 - 3.5), (3.5 - 9.5), (9.5 - 10.5)
- [ ] Functions to calculate the cumulative sum / mean, as data arrives, with a moving window of N MQTT signals (for example, a rolling window of the last 5 seconds of data)
- [ ] Above function with time interval selection, and split into N sections given input of boundaries
- [ ] Investigate the fastest ways to do these (optimise your functions), using Python's `time` library to measure them
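
The boundary-splitting task above could be approached along the following lines. This is a rough sketch, not a reference solution. It assumes each signal comes with a sorted list of sample times in ms, treats each interval as half-open (start included, stop excluded), and reports NaN as the mean of an interval containing no samples. The name `stats_by_interval` and the tuple layout of the result are hypothetical choices.

```python
import math
from bisect import bisect_left
from typing import List, Sequence, Tuple


def stats_by_interval(
    times: Sequence[float],
    values: Sequence[float],
    boundaries: Sequence[float],
) -> List[Tuple[float, float, float, float]]:
    """Return (start, stop, sum, mean) for each pair of consecutive boundaries.

    `times` must be sorted in ascending order and match `values` in length.
    """
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    if len(boundaries) < 2:
        raise ValueError("at least two boundaries are needed to define an interval")
    for earlier, later in zip(boundaries, boundaries[1:]):
        if later <= earlier:
            raise ValueError(
                f"boundaries must increase: {later} does not come after {earlier}"
            )

    results = []
    for start, stop in zip(boundaries, boundaries[1:]):
        lo = bisect_left(times, start)  # first sample with time >= start
        hi = bisect_left(times, stop)   # first sample with time >= stop (excluded)
        chunk = values[lo:hi]
        total = math.fsum(chunk)
        mean = total / len(chunk) if len(chunk) > 0 else math.nan
        results.append((start, stop, total, mean))
    return results


# Made-up sample times (ms) and values (assumption).
times_ms = [3.0, 3.5, 4.0, 4.5, 5.0, 5.5]
samples = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

print(stats_by_interval(times_ms, samples, [3.0, 4.6, 5.5]))
# [(3.0, 4.6, 10.0, 2.5), (4.6, 5.5, 5.0, 5.0)]

try:
    stats_by_interval(times_ms, samples, [3.0, 4.6, 2.2])
except ValueError as error:
    print("rejected:", error)
```

Whether the stop boundary should be included is a design decision to agree on early: with half-open intervals, a sample that sits exactly on a shared boundary is counted once, in the later interval.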

### Extensions
Complete all tasks before starting extensions.

- [ ] Create similar functions (with time selection / splitting) for other useful statistics (**Check with Esher/Haroon before doing the work**)
- [ ] **Collaborate with Project_1_E: File I/O:** store data to file or files (think about how it should be formatted, labelled etc), and load the data from said file or files
- [ ] **Collaborate with Project_1_E: File I/O:** read time selection / splitting boundaries from a configuration file
- [ ] Time how long it takes (using Python's `time` library) to read input data from a configuration file
- [ ] Define standard configuration files and methods for:
    - [ ] time / interval selection
- [ ] Think of ways to optimise reading configuration data (objects might help here)
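
For the configuration-file extensions, one way to start is to read the boundaries from a small JSON file and time the read with `time.perf_counter`. The JSON format, the key name `boundaries_ms`, the file name, and the function `load_boundaries` are all assumptions made for illustration; the actual format should be agreed with the File I/O project.

```python
import json
import os
import tempfile
import time
from typing import List


def load_boundaries(path: str) -> List[float]:
    """Read interval boundaries (in ms) from a JSON configuration file."""
    with open(path, "r", encoding="utf-8") as handle:
        config = json.load(handle)
    # "boundaries_ms" is a hypothetical key name chosen for this sketch.
    return [float(boundary) for boundary in config["boundaries_ms"]]


# Write a throwaway configuration file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = os.path.join(tmp_dir, "intervals.json")
    with open(config_path, "w", encoding="utf-8") as handle:
        json.dump({"boundaries_ms": [-0.5, 0.0, 3.5, 9.5, 10.5]}, handle)

    start = time.perf_counter()
    loaded = load_boundaries(config_path)
    elapsed = time.perf_counter() - start

print(loaded)  # [-0.5, 0.0, 3.5, 9.5, 10.5]
print(f"read took {elapsed:.6f} s")
```

A single timing like this is noisy; repeating the read many times and taking the best or the median gives a fairer comparison between approaches.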

### Goal
#### Provide methods (functions / classes) to:
- [ ] perform statistical analysis on single signals
- [ ] perform statistical analysis on single signals over time 
- [ ] perform statistical analysis on multiple signals
- [ ] perform statistical analysis on multiple signals over time 
- [ ] allow the user to perform the above analysis on multiple time intervals of the MQTT signal data
- [ ] allow the user to save/load the statistical data to/from file
- [ ] allow the user to load configurations from file