# <center>Comparison of the (Dis)charge Rates of Nickel Cadmium Batteries</center>
## <center>Open Research Institute, Inc. (ORI)</center>
## <center>2024</center>

__<center>Contributors:</center>__
<center>Michael Easton, Lab Tech 1</center>
<center>R. Easton, Lab Tech 3</center>

__<center>Supervisors:</center>__
<center>Michelle Thompson, CEO</center>
<center>Paul Williamson, Advisor</center>

# Abstract
Using data exported by the UBA software, we isolate the charge and discharge curves of each battery cell and compare them against one another. From this comparison, we group the cells into collections of batteries based on the type of cycle and the number of cycles used, and test the quality of each battery. 
# Methods  
## Mathematical Component  
   By taking the pointwise differences between the same cycle of two battery cells and summing them, we obtain a single number that approximates how much the two cells differ. We then group together the cells with the smallest differences and use those groups to create our optimized batteries.  
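
As a rough formalization of this comparison (a sketch based on the code later in this walk-through; the symbols $V_x$, $N$ and $D_{xy}$ are our own notation, not names used by the UBA software), let $V_x[i]$ be the $i$-th voltage sample of one cycle from cell $x$. The difference score between cells $x$ and $y$ for that cycle is then approximately

$$D_{xy} = \left|\sum_{i=0}^{N-1} \left(V_x[i] - V_y[i]\right)\right|$$

where $N$ is the number of sample positions the two cycles have in common. Because the differences are summed before the absolute value is taken, positive and negative deviations can cancel, so a small $D_{xy}$ indicates similar average behavior rather than a guaranteed point-by-point match.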
# Data Conditioning  
Links to all utilized code for this section and additional documentation such as manuals can be found at the [Project-NiCd GitHub repository.](https://github.com/OpenResearchInstitute/documents/tree/master/Remote_Labs/Project_NiCd)  
## Importing Libraries
We will be utilizing `os`, `pandas`, `itertools`, `numpy` and `contextlib`. Run the command `pip3 install pandas` and `pip3 install numpy` if you do not have these modules installed yet.  

In [1]:
from contextlib import redirect_stdout # For printing our files
import os # OS File system utilization in printing files
import pandas as pd #Manipulating our data frames
import numpy as np #Utilizing numpy commands

## Obtaining our UBA Files
The notes recording what happened during the test of each individual battery cell are stored in a .uba data file. To access these notes so we can start grouping our data, we will iterate through all of the .uba files, then write the portion of each file that contains the notes to its own separate file. 
### Iterating through our UBA files
First, we will start by defining terms we will use in our iteration. We need to define the folder that our .uba files are in, which will be wherever you have the testing data stored. This code was made by Paul Williamson/KB5MU.

In [2]:
uba_folder = '' 

You MUST include your personal directory for your download of the file. Do not use the placeholder used in this walk-through as the directory. 


Next, we will define the file extension of the files we want to collect. Because the files we are looking for are the only .uba files in our directory, we will set this to .uba.

In [3]:
uba_extension = ".uba"

Next, we will define the output extension. This is the suffix appended to each file name, and it also sets the file format our output is saved in.

In [4]:
output_extension = "_summary.txt"

### Write out our `<Messages>` block to file

Next, we will iterate through our `.uba` files in the `uba_folder` directory. To do this, we will utilize a <font color="green">with</font> and an <font color="green">open</font> statement to open a file in the <font color="blue">listdir</font> of our directory, run our code, then close the file. 

To start with, we will create a <font color="green">for</font> loop to iterate through the filenames in our `uba_folder` directory.  

```python
for filename in os.listdir(uba_folder):
```

Then, we will indent and use an <font color="green">if</font> statement so that only files ending in `uba_extension` (`.uba`) are used in the code that follows. For this, we use the <font color="blue">endswith</font> method of `filename`.

```python
for filename in os.listdir(uba_folder):
    if filename.endswith(uba_extension):
```
Next, to keep each file open only while our code runs and then close it, we will use a <font color="green">with</font> statement together with <font color="green">open</font>. We refer to the open file <font color="green">as</font> `f` for brevity. os.<font color="blue">path</font>.<font color="blue">join</font> joins the directory and the `filename` located in it into one path. Including <font color="red">'r'</font> means the file is opened for reading only, not for reading and writing. Additionally, we have included the type of encoding, <font color="red">cp1252</font>, so that the file can be properly read by the code.

```python
for filename in os.listdir(uba_folder):
    if filename.endswith(uba_extension):
        with open(os.path.join(uba_folder, filename), 'r', encoding='cp1252') as f:			
```
We will use another <font color="green">with</font> statement nested inside of our previous one to open our print-out files where we will see our summaries. To differentiate the files and to save our file as a readable format, we will add the filename with the output extension, then set the open function to write (with the string <font color="red">'w'</font>). We call that open file `outf`.

```python
for filename in os.listdir(uba_folder):
    if filename.endswith(uba_extension):
        with open(os.path.join(uba_folder, filename), 'r', encoding='cp1252') as f:
            with open(filename+output_extension, 'w') as outf:
```
Now that we have described our loops and conditional statements, we can now input our code. After indenting one more time,  we will print the `filename`.

```python
for filename in os.listdir(uba_folder):
    if filename.endswith(uba_extension):
        with open(os.path.join(uba_folder, filename), 'r', encoding='cp1252') as f:
            with open(filename+output_extension, 'w') as outf:
                print(filename)
```

Next, we will set up a boolean variable, `at_messages`, and set it to <font color="green">False</font>. It acts as a flag that records whether we have reached the line where the `<Messages>` block begins. 

```python
for filename in os.listdir(uba_folder):
    if filename.endswith(uba_extension):
        with open(os.path.join(uba_folder, filename), 'r', encoding='cp1252') as f:
            with open(filename+output_extension, 'w') as outf:
                print(filename)
                at_messages = False
```
Because we want to determine at which line to start keeping data, we will utilize another <font color="green">for</font> loop to iterate over the lines of our <font color="green">open</font> `.uba` file. 
```python
for filename in os.listdir(uba_folder):
    if filename.endswith(uba_extension):
        with open(os.path.join(uba_folder, filename), 'r', encoding='cp1252') as f:
            with open(filename+output_extension, 'w') as outf:
                print(filename)
                at_messages = False
                for line in f:
```

Underneath this loop, we will indent another time and pass a conditional <font color="green">if</font> statement to check whether the line, stripped of surrounding whitespace, is exactly the string `'<Messages>'`. 
```python
for filename in os.listdir(uba_folder):
    if filename.endswith(uba_extension):
        with open(os.path.join(uba_folder, filename), 'r', encoding='cp1252') as f:
            with open(filename+output_extension, 'w') as outf:
                print(filename)
                at_messages = False
                for line in f:
                    if line.strip() == "<Messages>":
```

Next, after indenting one more time,  we will make our previous boolean, `at_messages`, <font color="green">True</font>. Utilizing this boolean, we will pass another <font color="green">if</font> statement, and print our output in our currently open writable file. Altogether, our code should look like this.


In [5]:
for filename in os.listdir(uba_folder):
    if filename.endswith(uba_extension):
        with open(os.path.join(uba_folder, filename), 'r', encoding='cp1252') as f:
            with open(filename+output_extension, 'w') as outf:
                print(filename)
                at_messages = False
                for line in f:
                    if line.strip() == "<Messages>":
                        at_messages = True
                    if at_messages:
                        print(line.strip(), file=outf)

FileNotFoundError: [WinError 3] The system cannot find the path specified: ''
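
The loop above can also be packaged as a small function, which makes it easier to try on a single file before running it over a whole folder. The following is a minimal sketch under our own assumptions: the function name `extract_messages` and the demonstration file contents are hypothetical, and a temporary folder stands in for your real `uba_folder`.

```python
import os
import tempfile


def extract_messages(uba_path, summary_path, encoding="cp1252"):
    """Copy every line from the <Messages> line onward into summary_path."""
    at_messages = False
    with open(uba_path, "r", encoding=encoding) as src, \
         open(summary_path, "w", encoding=encoding) as dst:
        for line in src:
            if line.strip() == "<Messages>":
                at_messages = True
            if at_messages:
                dst.write(line.strip() + "\n")


# Demonstration on a made-up file; real .uba files contain far more data.
demo_dir = tempfile.mkdtemp()
demo_uba = os.path.join(demo_dir, "demo.uba")
demo_summary = demo_uba + "_summary.txt"
with open(demo_uba, "w", encoding="cp1252") as f:
    f.write("header line\n<Messages>\nnote one\nnote two\n")

extract_messages(demo_uba, demo_summary)
with open(demo_summary, encoding="cp1252") as f:
    summary_lines = f.read().splitlines()
print(summary_lines)
```

Unlike the loop above, which writes each summary into the current working directory, this sketch writes to whatever path is passed in, so the summaries can be kept next to the original files if preferred.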

### Manual Comparison
We deliberately chose not to automate this process, because the difficulty of automating it would likely have cost more time than sorting each .uba summary.txt file by hand. If wanted, you can attempt to automate it. For this experiment, however, we sorted each file into its corresponding group manually. We found that each file contains two identifiable markers, one signifying the start of the discharge and one signifying the start of the long 3 hour equalization charge. Because of this, we are generally able to tell how many cycles each file contains. If we want to utilize later cycles of the batteries, we can only use `sixgroup`. However, all other files have at least one cycle to use, so we can utilize all of the files for the first cycle analysis.  

1HDC = 1 Hour Discharge  
3HEC = 3 Hour Equalization Charge

In [10]:

#1HDC, 3HEC, 1HDC
threegroup = [19, 20, 83, 84]

#1HDC, 3HEC, 1HDC, 3HEC
fourgroup = [6, 4, 3, 2, 1]

#1HDC, 3HEC, 1HDC, 3HEC, 1HDC
fivegroup = [95, 96, 94, 93, 65, 54, 53, 31, 30, 29, 15]

#1HDC, 3HEC, 1HDC, 3HEC, 1HDC, 3HEC
sixgroup = [91, 89, 88, 87, 86, 85, 80, 79, 78, 77, 74, 73, 72, 71, 66, 64, 63, 62, 61, 60, 59, 58, 57, 56, 55, 52, 51, 50, 49, 46, 45, 42, 41, 40, 39, 38, 37, 36, 35, 34, 32, 28, 26, 25, 21, 22, 21, 18, 17, 16,  14, 13]
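
Because these lists are typed in by hand, a quick consistency check can catch transcription slips before the groups are used. This is only a sketch: the helper name `find_duplicates` is our own, and the lists are copied from the cell above.

```python
from collections import Counter

threegroup = [19, 20, 83, 84]
fourgroup = [6, 4, 3, 2, 1]
fivegroup = [95, 96, 94, 93, 65, 54, 53, 31, 30, 29, 15]
sixgroup = [91, 89, 88, 87, 86, 85, 80, 79, 78, 77, 74, 73, 72, 71, 66, 64,
            63, 62, 61, 60, 59, 58, 57, 56, 55, 52, 51, 50, 49, 46, 45, 42,
            41, 40, 39, 38, 37, 36, 35, 34, 32, 28, 26, 25, 21, 22, 21, 18,
            17, 16, 14, 13]


def find_duplicates(numbers):
    """Return the file numbers that appear more than once, sorted."""
    counts = Counter(numbers)
    return sorted(n for n, count in counts.items() if count > 1)


# Repeats inside one group.
within_six = find_duplicates(sixgroup)

# Repeats across groups: deduplicate inside each group first so that
# only file numbers assigned to more than one group remain.
all_groups = [threegroup, fourgroup, fivegroup, sixgroup]
across_groups = find_duplicates(n for group in all_groups for n in set(group))

print(within_six, across_groups)
```

With the lists as written above, the check reports that 21 is listed twice in `sixgroup` and that no file number is assigned to more than one group. Which entry was intended in place of the repeated 21, if any, has to be confirmed against the summary files.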


## Defining Process
By creating a dictionary, we can assign each .csv file we read to its corresponding file number. First, we will define the usable file numbers in our dataset as a list. We arrived at this range by checking which files exist. You may or may not need to do this depending on your data collection. 

In [2]:
trurange = list(range(1,69))+list(range(83,99)) 

Next, we will create our `file` dictionary. This will make a dictionary (each key corresponds to an entry) to store our formatted and read files in. 

In [3]:
file = {}

The strings utilized in our data provide a clear delineation between the start and end of the discharge and charge cycles. Because of this, we will use them for truncating our dataframe later on. To do this, we will define them with an easily readable name. 

`dischargestart` is the start of the discharge cycle, graphed as a plateau in the data

In [4]:
dischargestart = 'Finish=MaxChargeC' 

`chargestart` is the end of the discharge cycle, the start of the charge cycle, and is graphed as a trough in the data.


In [5]:
chargestart = 'Finish=CutoffV' 

`split` is the split halfway through the charge cycle as the current increases, and appears as the point where the charge cycle changes.

In [6]:
split = 'Finish=MaxDischargeTime' 

`end` is the recorded user end of the data file.

In [7]:
end = 'Finish=UserRequest:Stop' 

### Reading our .csv files
The next step of our defining process uses our `file` dictionary to assign each .csv file to a key in the dictionary that is the same as the number attached to the file. To do this, we will utilize the range of data we previously defined as `trurange` (a list of all usable file numbers) in a <font color="green">for</font> loop to assign each key its value.

First, we make our <font color="green">for</font> loop. 

```python
for x in trurange:
```

Next, we will insert a <font color="green">try</font> statement after our <font color="green">for</font> statement. We are doing this so that for files that do not work, we can pass by them, and continue assigning files to the dictionary that do work.  

```python
for x in trurange:
    try:
```

After indenting another time, we will start to assign our dictionary (file) keys to values. 

```python
for x in trurange:
    try:
        file[x] = (
```

After setting our assignment up, we will place our file reader, pd.<font color="blue">read_csv</font>, on the next line, then leave its parenthesis open. pd.<font color="blue">read_csv</font> will read each file and create a `dataframe` for each.

```python
for x in trurange:
    try:
        file[x] = (
             pd.read_csv(
```
 Now, we must specify the name of our files in an <font color="red">f-string</font>, or formatted string, which allows each file name to be built separately. Because our previous for loop iterates `x` over the range of file numbers, we can simply place `x` in curly braces inside of this formatted string to insert each file number.  

```python
for x in trurange:
    try:
        file[x] = (
             pd.read_csv(
                f'NiCd - Key Lime #{x}(data).csv', 
```

Next, we will add two arguments to ensure all of the data is read as is and nothing is removed.

```python
for x in trurange:
    try:
        file[x] = (
             pd.read_csv(
                f'NiCd - Key Lime #{x}(data).csv', 
                skip_blank_lines=False, #We want all index numbers to stay the same
                keep_default_na=True, #We want to keep all of our N/A values for the same reason
```

Now that we have ensured our file is being read properly, we will define the names of each of the columns. Our data format has 5 columns, in order: the time in seconds, the voltage, the current, the temperature in Celsius, and the strings attached to the data by the software used. With the columns named, we can refer to a specific column of any `file[x]` by using `file[x]`[<font color="red">'Column'</font>]. We will also close the parentheses we previously opened.

```python
for x in trurange:
    try:
        file[x] = (
             pd.read_csv(
                f'NiCd - Key Lime #{x}(data).csv', 
                skip_blank_lines=False, #We want all index numbers to stay the same
                keep_default_na=True, #We want to keep all of our N/A values for the same reason
                names=['Time', 'Volt', 'M/A', 'C', 'Sig'],
            )
        )
```

Next, we will close our <font color="green">try</font> block by including the <font color="green">except</font> statement, which runs when the code under <font color="green">try</font> fails. We have included a printout of an error message with an f-string to tell us which files have failed, if any. You can utilize this formatting of the code to test which files are usable, then pass a list similar to `trurange` to select only the files that are usable. The final code should look something like this.

In [8]:
for x in trurange:
    try:
        file[x] = (
             pd.read_csv(
                f'NiCd - Key Lime #{x}(data).csv', 
                skip_blank_lines=False, #We want all index numbers to stay the same
                keep_default_na=True, #We want to keep all of our N/A values for the same reason
                names=['Time', 'Volt', 'M/A', 'C', 'Sig'],
            )
        )
    except Exception:  # skip files that are missing or cannot be parsed
        print(f'fail in {x}')

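
If every file number is reported as failing, a likely cause is that the .csv files are not in the current working directory. The check described above, testing which files are usable and building a list similar to `trurange` from the result, could be sketched as follows. The function name `usable_file_numbers`, the `folder` parameter and the demonstration rows are our own assumptions, not part of the original code.

```python
import os
import tempfile

import pandas as pd


def usable_file_numbers(candidates, folder="."):
    """Return the numbers whose data file exists and parses with pandas."""
    usable = []
    for x in candidates:
        path = os.path.join(folder, f"NiCd - Key Lime #{x}(data).csv")
        try:
            pd.read_csv(path, names=["Time", "Volt", "M/A", "C", "Sig"])
        except (OSError, pd.errors.ParserError, pd.errors.EmptyDataError):
            continue  # missing, unreadable, or malformed file
        usable.append(x)
    return usable


# Demonstration with two made-up files in a temporary folder.
demo_folder = tempfile.mkdtemp()
for x in (1, 3):
    demo_path = os.path.join(demo_folder, f"NiCd - Key Lime #{x}(data).csv")
    with open(demo_path, "w") as f:
        f.write("0,1.30,0.5,21.0,\n1,1.29,0.5,21.0,Finish=CutoffV\n")

found = usable_file_numbers(range(1, 5), demo_folder)
print(found)
```

The returned list can then be used in place of a hand-written range.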


## Acquiring Index Locations

### Defining our Locations to Dictionaries
By creating dictionaries whose keys are the file numbers and whose values are the positional locations of each of the delineating strings, we can use these dictionaries at any point to refer to any file's start, stop and split in the data. The first cycle in all files is a preparation for the battery that is not used for this experiment, so we refer to the cycle that is being used, technically the second one, as the "first" and "true" cycle to differentiate it from the apparent first cycle. We will start by defining our dictionaries. 

`locsigdc` is the first true discharge location.

In [18]:
locsigdc = {} #first true discharge positional location

`locsigch` is the first true charge location.

In [19]:
locsigch = {} #first true charge positional location

`locsigsp` is the first true split in the charge cycle's location.

In [20]:
locsigsp = {} #first true split positional location

`locsigend` is the end of the first charge cycle's location.

In [21]:
locsigend = {} #end of the the first charge cycle

Next, we will create a <font color="green">for</font> loop that iterates over the keys stored by our file dictionary, so that only the keys with a successfully read `dataframe` are used.

```python
for x in file.keys():
```

Next, we will add a series of <font color="green">if</font> statements. Each one uses the .<font color="blue">loc</font>[] indexer of pandas to select the rows containing a marker string, then reads their row labels via the .<font color="blue">index</font> attribute. We pass these to a list using .<font color="blue">tolist</font>() so that we can refer to specific locations by list element. Then, utilizing the <font color="green">len</font> function, we only apply the following code to files in which every list of index locations has <font color="green">at least 3</font> entries. We do not want to use files that have broken data, in which tests were interrupted or incomplete and only 0, 1 or 2 instances of each string were recorded.


```python
for x in file.keys(): 
    if len(file[x].loc[file[x]['Sig'] == dischargestart].index.tolist()) >=3:
        if len(file[x].loc[file[x]['Sig'] == chargestart].index.tolist()) >=3:
            if len(file[x].loc[file[x]['Sig'] == split].index.tolist()) >=3:
```

Our next <font color="green">if</font> statement requires that our index locations be in sequential order. By only using files in which the discharge we want comes before the charge, and the charge we want comes before the split, we select exclusively for files that have properly encoded data. Line continuations (using `\`) have been included for easier readability and accessibility.

```python
for x in file.keys(): 
    if len(file[x].loc[file[x]['Sig'] == dischargestart].index.tolist()) >=3:
        if len(file[x].loc[file[x]['Sig'] == chargestart].index.tolist()) >=3:
            if len(file[x].loc[file[x]['Sig'] == split].index.tolist()) >=3:
                if file[x].loc[file[x]['Sig'] == dischargestart].index.tolist()[0]\
                < file[x].loc[file[x]['Sig'] == chargestart].index.tolist()[1]\
                < file[x].loc[file[x]['Sig'] == split].index.tolist()[2]:
```

Underneath this <font color="green">if</font> statement, we will assign our dictionaries the index value of their respective marker string, per file. Using boolean indexing in pandas, we select the rows in which the `Sig` column equals the string with .<font color="blue">loc</font>, take the index of every match, and collect these index values in a list. We then use list indexing to specify which element we are looking for: element 0 for the discharge, 1 for the charge and 2 for the split, going up in sequential order because of the order and correspondence of each string. The end location, `locsigend`, is taken as 500 rows past the split. The final code should look something like this.



In [22]:
for x in file.keys(): 
    if len(file[x].loc[file[x]['Sig'] == dischargestart].index.tolist()) >=3:
        if len(file[x].loc[file[x]['Sig'] == chargestart].index.tolist()) >=3:
            if len(file[x].loc[file[x]['Sig'] == split].index.tolist()) >=3:
                if file[x].loc[file[x]['Sig'] == dischargestart].index.tolist()[0]\
                < file[x].loc[file[x]['Sig'] == chargestart].index.tolist()[1]\
                < file[x].loc[file[x]['Sig'] == split].index.tolist()[2]:
                        locsigdc[x] = (file[x].loc[file[x]['Sig'] == dischargestart].index.tolist()[0]) 
                        locsigch[x] = (file[x].loc[file[x]['Sig'] == chargestart].index.tolist()[1])
                        locsigsp[x] = (file[x].loc[file[x]['Sig'] == split].index.tolist()[2])
                        locsigend[x] = (file[x].loc[file[x]['Sig'] == split].index.tolist()[2] + 500)
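
The cell above repeats the same marker lookup several times. As an illustration only, the lookup and the acceptance rule can be wrapped in two small helpers and tried on a toy dataframe. The helper names `marker_rows` and `first_true_cycle`, and the made-up marker layout, are our own assumptions; real data will have many rows between markers.

```python
import pandas as pd

dischargestart = "Finish=MaxChargeC"
chargestart = "Finish=CutoffV"
split = "Finish=MaxDischargeTime"


def marker_rows(df, marker):
    """Return the row labels where the 'Sig' column equals marker."""
    return df.loc[df["Sig"] == marker].index.tolist()


def first_true_cycle(df):
    """Return (discharge, charge, split) labels, or None if the file fails."""
    dc = marker_rows(df, dischargestart)
    ch = marker_rows(df, chargestart)
    sp = marker_rows(df, split)
    # Same rule as above: at least three of each marker, in order.
    if min(len(dc), len(ch), len(sp)) >= 3 and dc[0] < ch[1] < sp[2]:
        return dc[0], ch[1], sp[2]
    return None


# Made-up marker layout, purely for illustration.
sig = [None] * 12
for start in (0, 4, 8):
    sig[start] = dischargestart
    sig[start + 1] = chargestart
    sig[start + 2] = split
demo = pd.DataFrame({"Volt": [1.3] * 12, "Sig": sig})

print(first_true_cycle(demo))          # complete file
print(first_true_cycle(demo.iloc[:6])) # truncated file with too few markers
```

Computing each list once per file, as here, gives the same selections as the nested `if` statements while avoiding the repeated lookups.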
                


### Truncating and Assigning our Cycle Data
To compare each of our cycles, we will truncate our data and reduce it down to just the voltage column for now. To easily access this data, we will assign each truncated dataframe to a corresponding dictionary entry.

First, we will create our dictionaries.

`dischargecycle` is the first discharge cycle.

In [23]:
dischargecycle = {} #first discharge

`splitchargecycle` is the first half of the charge cycle and the end of the discharge cycle.

In [24]:
splitchargecycle = {} #first half of the charge cycle

`postsplit` is the last half of the charge cycle, which then continues onto the start of the next discharge cycle.

In [25]:
postsplit = {} # last half of the charge cycle

Then, we will make a for loop iterating over the .<font color="blue">keys</font>() of `locsigdc`, so that we only use the files that passed our previous code. We will also include another <font color="green">try</font> statement to drop any broken files.

```python
for x in locsigdc.keys(): 
        try:
```
Next, we will assign each of our dictionaries the corresponding dataframe, reduced to just the voltage column by selecting `'Volt'`, utilizing the .<font color="blue">truncate</font>() method to shorten our data to only the cycle being looked at. For each, the `before` parameter is the start of the cycle and the `after` parameter is the end of the cycle. 

```python
for x in locsigdc.keys(): 
        try:
            dischargecycle[x] = file[x]['Volt'].truncate(before=locsigdc[x],
                                                             after=locsigch[x])
            splitchargecycle[x] = file[x]['Volt'].truncate(before=locsigch[x], 
                                                               after=locsigsp[x])
            postsplit[x] = file[x]['Volt'].truncate(before=locsigsp[x],
                                                        after=locsigend[x])
```
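
To see what .truncate() keeps, it can help to try it on a short series first. The voltages below are made up for illustration.

```python
import pandas as pd

volts = pd.Series([1.35, 1.30, 1.25, 1.20, 1.10, 1.00], name="Volt")

# truncate() keeps the rows whose index labels lie between `before`
# and `after`; both end points are included in the result.
segment = volts.truncate(before=1, after=3)
print(segment)
```

Because both end points are included, the row at `locsigch[x]` appears in both `dischargecycle[x]` and `splitchargecycle[x]`, and the row at `locsigsp[x]` appears in both `splitchargecycle[x]` and `postsplit[x]`.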

To end the <font color="green">try</font> block, we will include our <font color="green">except</font> statement and a printed <font color="red">f-string</font> error message to show which files fail this step. All in all, our code should look like this.

In [26]:
for x in locsigdc.keys(): 
        try:
            dischargecycle[x] = file[x]['Volt'].truncate(before=locsigdc[x],
                                                             after=locsigch[x])
            splitchargecycle[x] = file[x]['Volt'].truncate(before=locsigch[x], 
                                                               after=locsigsp[x])
            postsplit[x] = file[x]['Volt'].truncate(before=locsigsp[x],
                                                        after=locsigend[x])
        except Exception:  # report the file number and move on
            print(f'fail in {x}')

## Comparing our Conditioned Data
By subtracting the same cycle of two cells pointwise, summing the result and taking the absolute value, we obtain a measure of their overall deviation from each other. We compute this value for both the discharge cycle and the charge cycle, then subtract one from the other and take the absolute value again to find how much the two measures differ. We can then use this data to sort our resulting .csv file later on, and it also adds crucial context to paired matches.

First, we will create a <font color="green">for</font> loop that will allow us to iterate across the list of files used.

```python
for x in file.keys(): 
```

Next, we will include an <font color="green">if</font> statement to only select for keys that passed the previous loop to define the dictionaries. For this, we have used `splitchargecycle` arbitrarily.

```python
for x in file.keys(): 
    if x in splitchargecycle.keys():
```

To create our files, we will use a <font color="green">with</font> statement to <font color="green">open</font> a file, and later contextlib to write the printed output of our code to it. The name of this file will be an <font color="red">f-string</font>, which gives us one individually numbered file per cell to print our comparisons to. We will refer to this open file <font color="green">as</font> `f`. 

```python
for x in file.keys(): 
    if x in splitchargecycle.keys():
        with open(f'ChargeComparison_of_{x}.csv', 'w') as f:
```

Next, we will use `redirect_stdout()` to direct the following code to `f`.

```python
for x in file.keys(): 
    if x in splitchargecycle.keys():
        with open(f'ChargeComparison_of_{x}.csv', 'w') as f:
            with redirect_stdout(f):
```
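
As a small standalone illustration of what `redirect_stdout()` does, the sketch below sends a <font color="green">print</font> call into an in-memory buffer instead of a file. The `io.StringIO` buffer is only a stand-in for the open .csv file, and the numbers are made up.

```python
import io
from contextlib import redirect_stdout

buffer = io.StringIO()  # stands in for the open .csv file
with redirect_stdout(buffer):
    # Captured by the buffer instead of being shown on screen.
    print(1.5, ",", 2.5)

captured = buffer.getvalue()
```

Note that print places a space between its arguments, so the captured text is `1.5 , 2.5` followed by a newline. The same spacing appears around the commas in the comparison files written below.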

Within this, we will add another <font color="green">for</font> statement to cycle with an added variable, `y`, through the <font color="blue">keys</font> of `splitchargecycle`.

```python
for x in file.keys(): 
    if x in splitchargecycle.keys():
        with open(f'ChargeComparison_of_{x}.csv', 'w') as f:
            with redirect_stdout(f):
                for y in splitchargecycle.keys():
```

Below this, we will add an <font color="green">if</font> statement to only pass the print-out of the subtraction if `x` is greater than `y`. This allows us to erase duplicates where the order is reversed. 

```python
for x in file.keys(): 
    if x in splitchargecycle.keys():
        with open(f'ChargeComparison_of_{x}.csv', 'w') as f:
            with redirect_stdout(f):
                for y in splitchargecycle.keys():
                    if x>y:
```

Adding to this, we will pass another <font color="green">if</font> statement that will only pass the print-out of the subtraction if the two files are not the same. This allows us to drop instances where the difference will automatically be zero, as these values provide no currently usable information and make sorting the data difficult. Since `x > y` already rules out `x == y`, this check acts as an explicit safeguard.

```python
for x in file.keys(): 
    if x in splitchargecycle.keys():
        with open(f'ChargeComparison_of_{x}.csv', 'w') as f:
            with redirect_stdout(f):
                for y in splitchargecycle.keys():
                    if x>y:
                        if x != y:
```
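
The `x > y` filter selects each unordered pair of files exactly once. For comparison, the same set of pairs can be produced with `itertools.combinations`, shown here as an alternative sketch rather than the method used in this walk-through. The file numbers are made up.

```python
from itertools import combinations

keys = [13, 7, 21, 16]  # made-up file numbers

# Nested-loop filter, as used in this walk-through.
pairs_from_loops = [(x, y) for x in keys for y in keys if x > y]

# combinations() over the sorted keys yields each unordered pair once;
# swapping the tuple order gives the same (larger, smaller) layout.
pairs_from_itertools = [(b, a) for a, b in combinations(sorted(keys), 2)]

print(sorted(pairs_from_loops) == sorted(pairs_from_itertools))
```

For n files both approaches give n(n-1)/2 pairs. The nested loops group the pairs by `x`, which is what lets the code write one output file per `x`.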

Next, we will pass our <font color="green">print</font> statement. This <font color="green">print</font> statement is formatted so that, when printed to this .csv, each distinct subtraction is separated into its own column. This is done by inserting commas in between the subtractions. 

In order, each subtraction is as follows:  
A: The absolute value of the sum of the difference between the two __discharge__ cycles being used.  
B: The absolute value of the sum of the difference between the two __start of the charge__ cycles being used.  
C: The absolute value of the difference between values A and B.  
D: The number of file x.  
E: The number of file y.  

Altogether, the code should look something like this.

In [27]:
for x in file.keys(): 
    if x in splitchargecycle.keys():
        with open(f'ChargeComparison_of_{x}.csv', 'w') as f:
            with redirect_stdout(f):
                for y in splitchargecycle.keys():
                    if x>y:
                        if x != y:
                            print(abs(((dischargecycle[x].reset_index(drop=True)).sub(dischargecycle[y].reset_index(drop=True))).sum()), 
                                  ',',
                                  abs(((splitchargecycle[x].reset_index(drop=True)).sub(splitchargecycle[y].reset_index(drop=True))).sum()),
                                  ',',
                                  abs((abs(((dischargecycle[x].reset_index(drop=True)).sub(dischargecycle[y].reset_index(drop=True))).sum()))-(abs(((splitchargecycle[x].reset_index(drop=True)).sub(splitchargecycle[y].reset_index(drop=True))).sum()))),
                                  f',{x},', 
                                  f'{y}'
                                 )
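
The values A, B and C can be checked by hand on small inputs. The sketch below uses a hypothetical helper, `summed_difference`, and made-up voltages of deliberately different lengths. It is meant to illustrate the arithmetic, not to replace the cell above.

```python
import pandas as pd


def summed_difference(a, b):
    """Absolute value of the summed pointwise difference of two Series."""
    a = a.reset_index(drop=True)
    b = b.reset_index(drop=True)
    # sub() aligns on the index; positions present in only one Series
    # become NaN, and sum() skips NaN by default.
    return abs(a.sub(b).sum())


# Made-up voltages for two cells.
discharge_x = pd.Series([1.30, 1.25, 1.20, 1.10])
discharge_y = pd.Series([1.30, 1.20, 1.20])
charge_x = pd.Series([1.10, 1.30, 1.40])
charge_y = pd.Series([1.10, 1.20, 1.40])

value_a = summed_difference(discharge_x, discharge_y)  # column A
value_b = summed_difference(charge_x, charge_y)        # column B
value_c = abs(value_a - value_b)                       # column C
print(value_a, value_b, value_c)
```

When two cycles have different lengths, only the positions they share contribute to the sum, so the extra samples of the longer cycle are ignored in the comparison.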

### Creating our Dataframes
With our series of files containing our data, we will import them as dataframe entries in a dictionary to easily access and manipulate. To do this, we must first create our dictionary.

In [None]:
gofish = {}

Next, we will create a <font color="green">for</font> loop to iterate across the number of files created. We are using the <font color="blue">keys</font> of `splitchargecycle` to iterate because these are the numbers of files that have passed our cleaning.
```python
for x in splitchargecycle.keys():
```
Below this <font color="green">for</font> loop, we will read each file similarly to our original dictionaries using pd.<font color="blue">read_csv</font>(). We pass the name in an <font color="red"> f-string</font> to select each file in the <font color="blue">keys</font> we are iterating through, then we define each column being used for easy reference using the `names` argument.

In [28]:
for x in splitchargecycle.keys():
     gofish[x] = pd.read_csv(f'ChargeComparison_of_{x}.csv', 
                     names=['DisC_Difference', 'CharDifference', 'DisCharDiff', 'File', 'Compared'])


Because our 7th entry is empty, we will delete this entry.

In [None]:
del gofish[7] # our 7th key is empty

To test our `gofish` dictionary and make sure it properly encoded its information, we will <font color="green">print</font> the last file using x.

In [2]:
print(gofish[x])


### Acquiring and Printing the Minimums
By using another series of <font color="green">with</font> statements to open a file and write to it, we can print out the minimums of a specified column (the closest match in its value between the file number listed and whatever file number it is being compared to) then save all of those resulting minimums to its own .csv. 

Using a <font color="green">with</font> statement, we will <font color="green">open</font> our specified file and add that we are <font color="red">writing</font> to it with the <font color="red">'w'</font> argument. We then describe it <font color="green">as</font> `f` to use it in our next statement.

```python
with open('DisAndChargeMin.csv', 'w') as f:
```

Next, we will use `redirect_stdout()` to direct the following code to `f`.

```python
with open('DisAndChargeMin.csv', 'w') as f:
    with redirect_stdout(f):
```

Underneath these <font color="green">with</font> statements, we will then use a <font color="green">for</font> loop to iterate through the <font color="blue">keys</font> of `gofish` as x. 

```python
with open('DisAndChargeMin.csv', 'w') as f:
    with redirect_stdout(f):
        for x in gofish.keys():
```

Finally, we will print to our open file (<font color="red">'DisAndChargeMin.csv'</font>) the row <font color="blue">located</font> by the <font color="blue">index of the minimum</font> value of the <font color="red">'DisCharDiff'</font> column we previously described, that is, the absolute difference between the __discharge__ and charge sums for that row. This column can be switched for any other based on what is being valued more. We then turn this row into comma-separated text, which will print to our already open .csv file.

In [30]:
with open('DisAndChargeMin.csv', 'w') as f:
    with redirect_stdout(f):
        for x in gofish.keys():
            frame = gofish[x].reset_index(drop=True)  # renumber rows 0..n-1
            closest = frame['DisCharDiff'].idxmin()  # row with the smallest difference
            print(frame.loc[[closest]].to_csv(index=False, header=False))
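To see what the row selection does in isolation, here is a minimal sketch on a made-up three-row frame. The `toy` frame and its values are illustrative assumptions, not project data, and an in-memory `io.StringIO` buffer stands in for the open .csv file.

```python
import io
from contextlib import redirect_stdout

import pandas as pd

# Hypothetical comparison table for a single file (values invented for illustration).
toy = pd.DataFrame(
    {
        'DisCharDiff': [0.40, 0.05, 0.22],
        'File': [9, 9, 9],
        'Compared': [7, 8, 13],
    },
    index=[10, 11, 12],  # a non-default index, to show why reset_index is applied
)

buffer = io.StringIO()  # stands in for the open .csv file
with redirect_stdout(buffer):
    frame = toy.reset_index(drop=True)
    best = frame['DisCharDiff'].idxmin()  # label of the row with the smallest difference
    print(frame.loc[[best]].to_csv(index=False, header=False))

closest_row = buffer.getvalue().strip()  # the single line that would land in the file
```

Passing `[best]` as a list keeps the selection a one-row dataframe, so `.to_csv()` emits it as a single comma-separated line rather than one value per line.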

### Sorting our Collective Minimums
To adequately group our batteries, we will produce a .csv purposefully sorted beforehand for easy readability and accessibility.

First, we will create a dataframe using pd.<font color="blue">read_csv</font>(), importing our previously created file <font color="red">'DisAndChargeMin.csv'</font> and naming its five columns. We then use the <font color="blue">.sort_values</font>() method to sort by the <font color="red">'Compared'</font> column, specifying how the sort is done with three arguments. `axis` sets the axis along which we sort (0 reorders the rows), `ascending` sets the order of the sort, and `inplace` sets whether the dataframe itself is edited; with `inplace=False`, a new sorted dataframe is returned instead.

In [None]:
salmon = pd.read_csv(
    'DisAndChargeMin.csv',
    names=['DisC_Difference', 'CharDifference', 'DisCharDiff', 'File', 'Compared'],
).sort_values(by=['Compared'], axis=0, ascending=True, inplace=False)
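As a small illustration of how this sort behaves, the sketch below applies the same `.sort_values()` call to a made-up frame. The `toy` and `sorted_toy` names and their values are assumptions for demonstration only.

```python
import pandas as pd

# Invented values, used only to show the behaviour of the sort.
toy = pd.DataFrame({'File': [12, 8, 96], 'Compared': [61, 7, 63]})

sorted_toy = toy.sort_values(by=['Compared'], axis=0, ascending=True, inplace=False)

# inplace=False leaves `toy` untouched and returns a new, sorted frame.
# Each row keeps its original index label, so the index is no longer 0, 1, 2.
original_order = toy['Compared'].tolist()
sorted_order = sorted_toy['Compared'].tolist()
sorted_labels = list(sorted_toy.index)
```

Because the rows keep their original labels, the index column of a sorted frame appears out of order when printed.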

Next, we will test our dataframe to make sure all values/columns were imported correctly and sorted.

In [None]:
print(salmon)

Finally, we will use the <font color="blue">.to_csv</font>() method of the `pandas` dataframe to export it as a .csv file titled <font color="red">'DisChargeDIFFSorted.csv'</font>. We pass three arguments: `header`, which sets whether the column names are written; `index`, which sets whether a column for the index itself is included; and `mode`, which sets how the file is opened (<font color="red">'w'</font> overwrites any existing file).

In [31]:
salmon.to_csv('DisChargeDIFFSorted.csv', header=True, index=False, mode='w')

    DisC_Difference  CharDifference  DisCharDiff  File  Compared
0          0.334527        0.108508     0.226019     8         7
1          0.642332        0.233952     0.408380     9         7
5          0.089003        0.035232     0.053771    13         7
4          5.883620        0.316047     5.567573    12         7
48         0.121811        0.138411     0.016600    64         7
..              ...             ...          ...   ...       ...
55         0.544735        0.228576     0.316159    90        61
61         0.259613        0.262908     0.003295    96        63
62         0.246732        0.251713     0.004981    97        90
59         0.268889        0.336858     0.067969    94        91
60         0.261835        0.274535     0.012700    95        93

[64 rows x 5 columns]


With this, we have succeeded in __cleaning__, __aggregating__, __comparing__, and __sorting__ the Project NiCd battery files.

Below is a version of the code above that compares the entire cycle (the discharge, charge, and post-split segments joined end to end) rather than a single segment.

In [None]:
# Build one continuous trace per file from its three cycle segments.
fullcycle = {}
for x in dischargecycle.keys():
    # Keep only files whose post-split and charge segments stay below these maxima.
    if postsplit[x].max() < 1.5 and splitchargecycle[x].max() < 1.3:
        fullcycle[x] = pd.concat([dischargecycle[x].reset_index(drop=True),
                                  splitchargecycle[x].reset_index(drop=True),
                                  postsplit[x].reset_index(drop=True)],
                                 ignore_index=True)
        fullcycle[x].plot(kind='line')

# For each file, write the absolute summed difference against every lower-numbered file.
for x in fullcycle.keys():
    with open(f'FullCycleComparison#{x}.csv', 'w') as f:
        with redirect_stdout(f):
            for y in fullcycle.keys():
                if x > y:  # x > y already excludes x == y
                    print(abs(((fullcycle[x].reset_index(drop=True)).sub(fullcycle[y].reset_index(drop=True))).sum()), f',{x}, ', f'{y}')

# Read each comparison file back in as a dataframe.
tetris = {}
for x in fullcycle.keys():
    tetris[x] = pd.read_csv(f'FullCycleComparison#{x}.csv',
                            names=['Difference', 'File', 'Compared']).drop_duplicates()
del tetris[7]  # key 7 is empty, as with gofish
print(tetris)

# Collect the closest match for each file into one .csv.
with open('FullCycleCollectiveMin.csv', 'w') as f:
    with redirect_stdout(f):
        for x in tetris.keys():
            frame = tetris[x].reset_index(drop=True)
            closest = frame['Difference'].idxmin()  # row with the smallest difference
            print(frame.loc[[closest]].to_csv(index=False, header=False))

# Sort the collected minimums by the file each was compared to, then export.
emerald = pd.read_csv('FullCycleCollectiveMin.csv', names=['Difference', 'File', 'Compared']).sort_values(by=['Compared'], axis=0, ascending=True, inplace=False)
print(emerald)
emerald.to_csv('FullSorted.csv', header=True, index=False, mode='w')
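
For readers who want to check the comparison logic without writing intermediate files, the sketch below computes the same kind of pairwise difference in memory. It is not the method used in this project: the `cycles` dictionary and its values are invented, and the use of `itertools.combinations` and `groupby` is an assumed alternative shown only for illustration.

```python
from itertools import combinations

import pandas as pd

# Hypothetical stand-in for `fullcycle`: three short traces (invented values).
cycles = {
    7: pd.Series([1.30, 1.20, 1.10]),
    8: pd.Series([1.31, 1.19, 1.12]),
    9: pd.Series([1.40, 1.30, 1.25]),
}

rows = []
for y, x in combinations(sorted(cycles), 2):  # y < x, each pair visited once
    difference = abs(cycles[x].sub(cycles[y]).sum())
    rows.append({'Difference': difference, 'File': x, 'Compared': y})

pairs = pd.DataFrame(rows)

# For each file, keep the comparison with the smallest difference, then sort.
closest = (
    pairs.loc[pairs.groupby('File')['Difference'].idxmin()]
    .sort_values(by='Compared')
    .reset_index(drop=True)
)
```

As in the cell above, each file is compared only against lower-numbered files, so the lowest-numbered file has no row of its own in `closest`.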