# Project 1E: Data Handling: File Input/Output (AKA File I/O)
---
> - Harrison Mills
> - Nathanial Patterson
> - Tom Sutton
---
When handling data at 1 Hz, with the goal of handling it at 50 Hz, our code should be as efficient and fast as possible. One way to optimise our GUIs is to consider how data is handled, so as to minimise the time each operation takes. For example, should we pass the maximum energy to every function that needs it, when this number does not change very often? Because it can change, it should not be hard-coded. One option is to read it from a file. This does not make sense for this value alone, but we will be storing and changing many variables.

To this end, you will define a few configuration file templates. Most GUIs will require calibration data to convert units, some will require tolerances or limits, and most will require GUI-specific settings. You will define methods to read data from files, and investigate the fastest way of reading them and storing them as Python data.
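
As a starting point for discussion, a configuration file template could be sketched as below. This is a minimal sketch assuming JSON as the file format and using only the Python standard library; the key names (`calibration`, `limits`, `gui`), the values, and the helper names `save_config` and `load_config` are all hypothetical placeholders, not agreed settings.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical template: every key and value here is a placeholder,
# not a real machine setting.
CONFIG_TEMPLATE = {
    "calibration": {"gain": 1.0, "offset": 0.0},
    "limits": {"max_energy": 800.0, "tolerance": 0.05},
    "gui": {"refresh_rate_hz": 1},
}


def save_config(config, path):
    """Write a configuration dictionary to a JSON file."""
    Path(path).write_text(json.dumps(config, indent=4))


def load_config(path):
    """Read a JSON configuration file back into a dictionary."""
    return json.loads(Path(path).read_text())


# Round trip through a temporary file so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "example_gui_config.json"
    save_config(CONFIG_TEMPLATE, config_path)
    config = load_config(config_path)

# Read the file once, then pass values around instead of re-reading the file.
max_energy = config["limits"]["max_energy"]
```

Loading the file once at start-up and keeping the resulting dictionary in memory avoids repeated disk reads during each update cycle; whether JSON is the fastest option is one of the things this project should measure.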

Many of our GUIs will produce useful data. Some may be useful as experimental data, others as logs of machine performance. As such many stage 2 projects (GUIs) will include goals to store / load datasets. You will define those methods as part of this project. You will use our 40 x 2200 MQTT data, henceforth referred to as our dataframe, as an example, and investigate the fastest ways to store this data to file, and load it back from a file.

Note that **many other groups have been instructed to collaborate with you as part of their extensions**. Your research will be important in optimising our GUIs' performance.

<div class="alert alert-block alert-info">
<b>General Advice</b>

> In your groups:
> - Start by planning your work
> - You will need fundamental functions; try to plan what these should do, what arguments they should take, which should have default values (some default values will need research - ask us!), and what they should return. Don't forget to consider speed and the format of the data (float? int? array of floats? should the array be a certain length?...)
> - Ask for feedback on your plan to save time
> - Document your functions so others can understand / use / modify them
> - Build more complex functions using the fundamental functions; this may require you to go back and modify the fundamental functions.
> - Some things are suitable for objects (classes), others are not. Speed and accuracy are important.


> When coding a function (unless you have a preferred method):
> 1. Plan your code in bullet points / pseudo-code
> 2. Template your functions - put comments to say what you want to do
> 3. Implement your code step-by-step. Pair programming here is your friend and will catch many mistakes, as well as give 2 people enough experience of the code to help debug it
> 4. Test your code and document the testing
> 5. Provide standard tests <b>only if requested by Haroon/Esher</b> (define an input with a known output and assert that it is the same, or within a tolerance)    
> 6. Provide examples of use for each use case (different behaviour with optional arguments etc)
> 7. Turn your functions into objects if suitable
> 8. Get feedback from peers / supervisors / team-mates / users

> - <b> Make a repository to hold your group project</b> (public, so we can all access it when needed). Try to keep it well organised and easy to navigate and use, so that things are easy to find
> - Make a Jupyter notebook for each investigation / function / method: such that future users (perhaps you) can use these notebooks to construct the code required to make future GUIs work
> - Feel free to store individual work on your own 'dump' repository, and copy completed work to your group repository


</div>

---
## Tasks:

- [ ] Function to save the dataframe (R5IM and BLM signals in a 40 x 2200 2D array) to file, and read the data from file back into a 2D Python array
    - [ ] Consider the number of significant figures to store / load, the formatting of the file etc
    - [ ] Investigate different filetypes - start with .csv, then investigate other options e.g. .dat (ASCII text file), ...?
    - [ ] Investigate different data structures - pandas, polars, pyarrow? Most will have built-in functions to read/write to file, but there may be overhead (extra time taken) in converting data into different Python formats
    - [ ] Compare speed and accuracy
- [ ] Using your research from above or otherwise, define fast and accurate methods to:
    - [ ] Save arbitrary data to an easy-to-use filetype
    - [ ] Load arbitrary data from the same filetype
    - These methods will be used to store data from GUIs, and to load configurations to GUIs via 'configuration files'
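
One possible baseline for the speed and accuracy comparison is sketched below. It is a minimal sketch using only the standard library `csv` module, with random numbers standing in for the real MQTT signals; the function names `save_csv` and `load_csv` and the default of 6 significant figures are assumptions for illustration, not recommended values.

```python
import csv
import random
import tempfile
import time
from pathlib import Path

N_ROWS, N_COLS = 40, 2200  # shape of the dataframe described above


def save_csv(data, path, sig_figs=6):
    """Write a 2D list of floats to CSV, keeping sig_figs significant figures."""
    fmt = f"{{:.{sig_figs}g}}"
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for row in data:
            writer.writerow([fmt.format(x) for x in row])


def load_csv(path):
    """Read a CSV file back into a 2D list of floats."""
    with open(path, newline="") as f:
        return [[float(x) for x in row] for row in csv.reader(f)]


# Random stand-in data; real signals would come from MQTT.
random.seed(0)
data = [[random.uniform(-10.0, 10.0) for _ in range(N_COLS)] for _ in range(N_ROWS)]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "dataframe.csv"

    t0 = time.perf_counter()
    save_csv(data, path, sig_figs=6)
    t1 = time.perf_counter()
    loaded = load_csv(path)
    t2 = time.perf_counter()

# Accuracy: largest relative difference between original and reloaded values.
max_rel_error = max(
    abs(a - b) / abs(a)
    for row_a, row_b in zip(data, loaded)
    for a, b in zip(row_a, row_b)
    if a != 0.0
)
print(f"save: {t1 - t0:.4f} s, load: {t2 - t1:.4f} s, "
      f"max relative error: {max_rel_error:.2e}")
```

Timings from a single run are noisy, so repeating each measurement several times (for example with `timeit`) gives a fairer comparison. The same harness can then be pointed at pandas, polars or pyarrow readers and writers to see whether the conversion overhead outweighs their faster I/O.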

### Extensions:
Complete all tasks before starting extensions

- [ ] **Collaborate with all other projects that request your input**
- [ ] **Collaborate with Project_1_A: Data Verification:** Define configuration file template for BLM / R5IM selection
- [ ] **Collaborate with Project_1_B: Statistics:** read time selection / splitting boundaries from a configuration file

### Goal
#### Provide methods (functions / classes) to:
- [ ] Allow the user to store and load MQTT signal dataframe (40 x 2200 data points) to/from a .csv file
- [ ] Allow the user to store and load MQTT signal dataframe (40 x 2200 data points) to/from other file types if faster
- [ ] Define fast and accurate methods for storing data to file
- [ ] Define fast and accurate methods for loading data from file
- [ ] Define template configuration files that can be used to store / set / load important parameters for future GUIs
- [ ] Optimise the storing / loading of data from configuration files
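
For the "other file types" goal, a raw binary file is one candidate worth timing against .csv. The sketch below is an assumption-laden illustration rather than a chosen format: it uses the standard library `array` and `struct` modules, a made-up layout (two unsigned integers for the shape, followed by the values as 64-bit floats), and hypothetical function names `save_binary` and `load_binary`.

```python
import array
import random
import struct
import tempfile
from pathlib import Path


def save_binary(data, path):
    """Write a 2D list of floats as a (rows, cols) header plus raw float64 values."""
    n_rows, n_cols = len(data), len(data[0])
    flat = array.array("d", (x for row in data for x in row))
    with open(path, "wb") as f:
        f.write(struct.pack("<II", n_rows, n_cols))  # hypothetical 8-byte header
        flat.tofile(f)  # values are written in the machine's native byte order


def load_binary(path):
    """Read the file written by save_binary back into a 2D list of floats."""
    with open(path, "rb") as f:
        n_rows, n_cols = struct.unpack("<II", f.read(8))
        flat = array.array("d")
        flat.fromfile(f, n_rows * n_cols)
    return [flat[i * n_cols:(i + 1) * n_cols].tolist() for i in range(n_rows)]


# Random stand-in data with the same shape as the dataframe.
random.seed(1)
data = [[random.uniform(-10.0, 10.0) for _ in range(2200)] for _ in range(40)]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "dataframe.bin"
    save_binary(data, path)
    file_size = path.stat().st_size
    loaded = load_binary(path)

exact_round_trip = loaded == data
```

Because the floats are stored bit-for-bit, no precision is lost, at the cost of a file that is not human-readable and that depends on the byte order of the machine that wrote it. Whether it is actually faster than a well-tuned .csv or a library format should be measured rather than assumed.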