# Magic Metro Bus

### Example notebook to demonstrate package capabilities

## Auto sorting the data

Ensure there is a folder titled 'Raw Data' in your current directory containing KCM-data-files.zip, then import the module. The import itself runs the sort:

In [1]:
import auto_sort_data

Checking for existing directories
Unzipping data


  0%|          | 0/556 [00:00<?, ?it/s]

Grouping files by bus


100%|██████████| 556/556 [12:59<00:00,  1.40s/it]


Discovering files for buses with module changes


100%|██████████| 174/174 [00:00<00:00, 241179.41it/s]


Moving files to new directory


100%|██████████| 174/174 [00:00<00:00, 331731.32it/s]
100%|██████████| 67/67 [00:00<00:00, 102.23it/s]


You should now have two new folders: 'sorted_data', containing data for buses with no module changes, and 'vis_buses', containing data for buses whose modules have been swapped.
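A quick way to confirm the sort finished is to check that both folders exist. The sketch below is only an illustration: `missing_output_folders` is a hypothetical helper, not part of the package, and the base directory to pass in is an assumption (most likely the path returned by `find_directory()`).

```python
from pathlib import Path


def missing_output_folders(base_dir, expected=("sorted_data", "vis_buses")):
    """Return the expected folder names that are not present under base_dir."""
    base = Path(base_dir)
    return [name for name in expected if not (base / name).is_dir()]


# Hypothetical usage: check the current working directory.
missing = missing_output_folders(".")
print("Missing folders:", missing if missing else "none")
```

An empty list means both folders were found.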

## Building dataframes from the csv files

Directly reading the csv files into a dataframe formats the data in an impractical way:

In [3]:
import pandas as pd

examp_file = auto_sort_data.find_directory() + 'vis_buses/' + 'bus_1/' + '0016_ProfileData_20180731061359.csv'
df = pd.read_csv(examp_file)

df

Unnamed: 0,Unnamed: 1.1,Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9,...,Unnamed: 65,Unnamed: 66,Unnamed: 67,Unnamed: 68,Unnamed: 69,Unnamed: 70,Unnamed: 71,Unnamed: 72,Unnamed: 73,Unnamed: 74
0,ESS BMS and Battery Module Profile Data,,,,,,,,,,...,,,,,,,,,,
1,,,,,,,,,,,...,,,,,,,,,,
2,Data retrieved: 07/31/2018 @ 06:14:11,,,,,,,,,,...,,,,,,,,,,
3,,,,,,,,,,,...,,,,,,,,,,
4,BMS Software AS version:,0x0206BA42,,,,,,,,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
954,,,,,,,,,,,...,,,,,,,,,,
955,<Module Voltages (V)>,,,,,,,,,,...,,,,,,,,,,
956,,,,,,,,,,,...,,,,,,,,,,
957,,< 24.0,24,26.4,28.8,31.2,33.6,36,38.4,40.8,...,,,,,,,,,,


Our data building functions parse the csv files into meaningful dataframes for statistical analysis and visualization.
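To give a feel for what that parsing involves, the sketch below finds the row at which a section marker such as `<Module Voltages (V)>` appears in a raw file. It is not the package's implementation: `find_section_row` is a hypothetical helper, and `sample` is a shortened stand-in that reuses only strings visible in the output above.

```python
import csv
import io


def find_section_row(csv_text, marker):
    """Return the index of the first row with a cell equal to marker, else None."""
    for index, row in enumerate(csv.reader(io.StringIO(csv_text))):
        if any(cell.strip() == marker for cell in row):
            return index
    return None


# Shortened stand-in for a raw profile file (not real data).
sample = (
    "ESS BMS and Battery Module Profile Data,,\n"
    ",,\n"
    "Data retrieved: 07/31/2018 @ 06:14:11,,\n"
    ",,\n"
    "<Module Voltages (V)>,,\n"
)

print(find_section_row(sample, "<Module Voltages (V)>"))
```

Once a marker row is known, the histogram rows that follow it can be sliced out and given proper column names.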

In [1]:
import build_data_vis

In [2]:
directory = build_data_vis.find_directory() + 'vis_buses/'

The sort_bus_by_date function creates a dataframe that orders the file names for a given bus by date retrieved.

In [11]:
df_bus_dates = build_data_vis.sort_bus_by_date(directory, 'bus_1/')
df_bus_dates

Unnamed: 0,Filename,DateRetrieved
0,13J0016_ProfileData_20170920063536.csv,2017-09-20 06:36:19
1,!3J0018_ProfileData_20170920082828.csv,2017-09-20 08:29:04
2,13J0016_ProfileData_20180320103531.csv,2018-03-20 10:36:30
3,0016_ProfileData_20180731061359.csv,2018-07-31 06:14:11
4,0016_ProfileData_20180731063048.csv,2018-07-31 06:30:55
5,0016_ProfileData_20180731064002.csv,2018-07-31 06:40:09
6,13J0016_ProfileData_20180920063036.csv,2018-09-20 06:30:57
7,13J0016_ProfileData_20180920063914.csv,2018-09-20 06:39:31
8,13j0016_ProfileData_20180920064815.csv,2018-09-20 06:48:27
9,13j0016_ProfileData_20180920070233.csv,2018-09-20 07:02:46
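The ordering idea can be sketched with the standard library alone. This assumes each file carries a header line in the form seen in the raw output earlier (`Data retrieved: 07/31/2018 @ 06:14:11`). The helper names are hypothetical, and the header strings for two of the three files are reconstructed from the DateRetrieved column rather than read from the files.

```python
from datetime import datetime

RETRIEVED_FORMAT = "%m/%d/%Y @ %H:%M:%S"
PREFIX = "Data retrieved:"


def parse_retrieved(line):
    """Parse a 'Data retrieved: MM/DD/YYYY @ HH:MM:SS' header line."""
    if not line.startswith(PREFIX):
        raise ValueError(f"unexpected header line: {line!r}")
    return datetime.strptime(line[len(PREFIX):].strip(), RETRIEVED_FORMAT)


def sort_by_retrieved(pairs):
    """Sort (filename, header_line) pairs by the parsed retrieval time."""
    parsed = [(name, parse_retrieved(line)) for name, line in pairs]
    return sorted(parsed, key=lambda pair: pair[1])


pairs = [
    ("0016_ProfileData_20180731063048.csv", "Data retrieved: 07/31/2018 @ 06:30:55"),
    ("13J0016_ProfileData_20170920063536.csv", "Data retrieved: 09/20/2017 @ 06:36:19"),
    ("0016_ProfileData_20180731061359.csv", "Data retrieved: 07/31/2018 @ 06:14:11"),
]

for name, when in sort_by_retrieved(pairs):
    print(when, name)
```

Sorting on the parsed timestamp rather than the filename matters here because the filename prefixes are inconsistent (for example `0016_`, `13J0016_`, and `13j0016_`).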


The build_bus_df function takes as input a directory, a string of the desired bus directory, and a keyword (selected from 'Current', 'Voltage', or 'Power'). The function returns a dataframe of the entire bus current, voltage, or power over time, with the rows ordered sequentially by date.

In [12]:
build_data_vis.build_bus_df(directory, 'bus_1/', 'Voltage')

Unnamed: 0,< 450.0,450,460,470,480,490,500,510,520,530,...,670,680,690,700,710,720,730,740,>= 750.0,TOTAL
0,348,0,0,15,69,274,942,4161,26892,78020,...,2096893,1733041,938116,530859,91264,558,0,0,0,80985403
1,348,0,0,15,69,274,942,4161,26892,78020,...,2096893,1733041,938116,530859,91264,558,0,0,0,80985877
2,348,0,0,17,69,274,942,4161,26940,78589,...,2258399,1852922,1000410,540306,91636,564,0,0,0,86617528
3,348,0,0,17,69,274,942,4161,26940,78592,...,2339863,1931360,1041513,550848,92350,568,0,0,0,90059016
4,348,0,0,17,69,274,942,4161,26940,78592,...,2339863,1931360,1041513,550848,92350,568,0,0,0,90059338
5,348,0,0,17,69,274,942,4161,26940,78592,...,2339863,1931360,1041513,550848,92350,568,0,0,0,90059627
6,348,0,0,17,69,274,942,4161,26940,78604,...,2366050,1953026,1051446,553265,92491,568,0,0,0,91259903
7,348,0,0,17,69,274,942,4161,26940,78604,...,2366050,1953026,1051446,553265,92491,568,0,0,0,91260309
8,348,0,0,17,69,274,942,4161,26940,78604,...,2366050,1953026,1051446,553265,92491,568,0,0,0,91260309
9,348,0,0,17,69,274,942,4161,26940,78604,...,2366050,1953026,1051446,553265,92491,568,0,0,0,91261149
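The TOTAL column grows from one retrieval to the next, which suggests the counters are cumulative over the life of the bus. Assuming that reading is right, the activity between two retrievals is the difference of consecutive rows. The sketch below shows this on the first four TOTAL values from the table above; `increments` is a hypothetical helper.

```python
def increments(cumulative):
    """Return the differences between consecutive values of a cumulative series."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


# First four TOTAL values from the bus_1 voltage table above.
totals = [80985403, 80985877, 86617528, 90059016]
print(increments(totals))
```

On the dataframe itself, pandas' `DataFrame.diff()` computes the same row-to-row differences for every column at once.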


The build_module_df function takes as input a directory, a string of the desired bus directory, and an integer module number (between 1 and 16). The function returns a dataframe of the module voltages with the rows sequential in time. Note that the module data within the csv files is further subdivided into 12 submodules, so this function outputs 12 rows for each retrieval date within the bus folder. For example, running this function on bus 1 module 1 should return a dataframe with 216 rows. Rows 0-11 correspond to the first date in the bus folder, rows 12-23 to the next date, and so on.

In [13]:
build_data_vis.build_module_df(directory, 'bus_1/', 1)

Unnamed: 0,< 2.0,2,2.2,2.4,2.6,2.8,3,3.2,3.4,3.6,3.8,>= 4.0,TOTAL
0,0,0,0,0,127,226879,2601804,13728787,2060319,81412,0,0,18699328
1,0,0,0,0,12,110505,2566545,14090318,1914435,17513,0,0,18699328
2,0,0,0,0,369,332949,2613591,13485229,2098514,168676,0,0,18699328
3,0,0,0,0,24,125100,2576091,14042824,1934353,20936,0,0,18699328
4,0,0,0,0,539,359234,2613940,13427657,2105483,192475,0,0,18699328
...,...,...,...,...,...,...,...,...,...,...,...,...,...
211,0,0,0,0,510,190512,4083068,21651787,3054384,60660,0,0,29040921
212,0,0,0,0,1194,538432,4145804,20755450,3246999,353041,1,0,29040921
213,0,0,0,0,563,213689,4103070,21570151,3083451,69997,0,0,29040921
214,0,0,0,0,952,251073,4121799,21461983,3118184,86930,0,0,29040921
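Because each retrieval date contributes a block of 12 rows, a row index can be mapped back to its date and submodule with integer division. The helpers below are hypothetical and assume the submodules appear in the same order within every block, which the text does not state explicitly.

```python
SUBMODULES_PER_MODULE = 12  # from the description above


def locate_row(row_index, per_date=SUBMODULES_PER_MODULE):
    """Map a row index to zero-based (date_position, submodule_position)."""
    if row_index < 0:
        raise ValueError("row_index must be non-negative")
    return divmod(row_index, per_date)


def rows_for_date(date_position, per_date=SUBMODULES_PER_MODULE):
    """Return the range of row indices belonging to one retrieval date."""
    start = date_position * per_date
    return range(start, start + per_date)


print(locate_row(12))          # first submodule of the second date
print(list(rows_for_date(1)))  # row indices for the second date
```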


The build_module_average_df function is similar to build_module_df, but it averages the submodule data together to return only one row per retrieval date for the specified module. This allows for easier visualization of voltages at the module level.

In [14]:
build_data_vis.build_module_average_df(directory, 'bus_1/', 1)

DateRetrieved,< 2.0,2,2.2,2.4,2.6,2.8,3,3.2,3.4,3.6,3.8,>= 4.0,TOTAL
2017-09-20 06:36:19,0.0,0.0,0.0,0.0,204.0,218366.0,2592358.0,13789010.0,2013729.0,85666.083333,0.0,0.0,18699328.0
2017-09-20 08:29:04,0.0,0.0,0.0,0.0,204.0,218366.0,2592358.0,13789480.0,2013729.0,85666.083333,0.0,0.0,18699802.0
2018-03-20 10:36:30,0.0,0.0,0.0,0.0,835.25,307839.166667,3407022.0,17619430.0,2651823.0,137055.0,0.083333,0.0,24124005.0
2018-07-31 06:14:11,0.0,0.0,0.0,0.0,856.166667,333373.916667,3931136.0,20010170.0,3011958.0,185874.916667,0.416667,0.0,27473366.0
2018-07-31 06:30:55,0.0,0.0,0.0,0.0,856.166667,333373.916667,3931136.0,20010490.0,3011958.0,185874.916667,0.416667,0.0,27473686.0
2018-07-31 06:40:09,0.0,0.0,0.0,0.0,856.166667,333373.916667,3931136.0,20010770.0,3011958.0,185874.916667,0.416667,0.0,27473973.0
2018-09-20 06:30:57,0.0,0.0,0.0,0.0,913.25,361809.333333,4078580.0,20845170.0,3129409.0,197102.083333,0.416667,0.0,28612985.0
2018-09-20 06:39:31,0.0,0.0,0.0,0.0,913.25,361809.333333,4078580.0,20845580.0,3129409.0,197102.083333,0.416667,0.0,28613390.0
2018-09-20 06:48:27,0.0,0.0,0.0,0.0,913.25,361809.333333,4078580.0,20845580.0,3129409.0,197102.083333,0.416667,0.0,28613390.0
2018-09-20 07:02:46,0.0,0.0,0.0,0.0,913.25,361809.333333,4078580.0,20845580.0,3129409.0,197102.083333,0.416667,0.0,28613390.0
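The averaging step can be pictured as collapsing each block of 12 submodule rows into a single row of column means. The sketch below is a standard-library illustration under that assumption, not the package's code; `average_blocks` is a hypothetical helper and the numbers are invented toy data with a block size of 2 to keep the example short.

```python
from statistics import fmean


def average_blocks(rows, block_size=12):
    """Average each consecutive block of block_size rows, column by column."""
    if len(rows) % block_size:
        raise ValueError("row count must be a multiple of block_size")
    averaged = []
    for start in range(0, len(rows), block_size):
        block = rows[start:start + block_size]
        # zip(*block) regroups the block's values by column.
        averaged.append([fmean(column) for column in zip(*block)])
    return averaged


# Toy data: two "dates" with two "submodules" each.
toy_rows = [[1, 3], [3, 5], [10, 20], [30, 40]]
print(average_blocks(toy_rows, block_size=2))
```

Fractional values in the table above, such as 85666.083333, are consistent with a mean taken over 12 integer counts.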


The count_mod_changes function takes as an input a directory of data sorted by bus. The function will then return a dataframe showing the bus and module number as well as the number of times the module has been changed.

In [15]:
build_data_vis.count_mod_changes(directory)

Unnamed: 0,Bus,Module,Date,Change
0,1,1,"09/20/2017, 06:36:19",0
1,1,1,"09/20/2017, 08:29:04",0
2,1,1,"03/20/2018, 10:36:30",0
3,1,1,"07/31/2018, 06:14:11",0
4,1,1,"07/31/2018, 06:30:55",0
...,...,...,...,...
59,98,15,"05/30/2018, 08:13:37",1
60,98,16,"09/18/2017, 06:43:27",0
61,98,16,"09/18/2017, 10:52:47",1
62,98,16,"02/26/2018, 06:40:43",1
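If the Change column is a running count, as the description suggests, the total number of swaps for each module is the largest value seen for that bus and module pair. The sketch below applies this to a few rows copied from the output above; `total_changes` is a hypothetical helper.

```python
from collections import defaultdict


def total_changes(records):
    """Return {(bus, module): largest Change value seen} for the given rows."""
    counts = defaultdict(int)
    for bus, module, _date, change in records:
        key = (bus, module)
        counts[key] = max(counts[key], change)
    return dict(counts)


# (Bus, Module, Date, Change) rows copied from the table above.
records = [
    (1, 1, "09/20/2017, 06:36:19", 0),
    (1, 1, "09/20/2017, 08:29:04", 0),
    (98, 16, "09/18/2017, 06:43:27", 0),
    (98, 16, "09/18/2017, 10:52:47", 1),
    (98, 16, "02/26/2018, 06:40:43", 1),
]

print(total_changes(records))
```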


The find_replaced_modules function takes as input a directory of data sorted by bus. The function returns a dictionary with the bus folder name as the key and a list of the serial numbers of the modules that have full lifetime data as the value.

In [3]:
build_data_vis.find_replaced_modules(directory)

{'bus_1': [' 0039-1811-00U-402335-00804......'],
 'bus_10': [' 0039-1638-0HC-402335-00802..&...'],
 'bus_154': [" 0039-1635-06T-402335-00802..'..."],
 'bus_2': [' 0039-1522-04F-402335-00710......',
  ' 0039-1602-0L2-402335-00710......',
  ' 0039-0951-0DM-402335-00108..&...',
  ' 0039-1641-0P2-402335-00802..)...',
  ' 364A1585G3REVA-14A3920..........',
  ' 0039-1632-0UG-402335-00710.. ...',
  ' 0039-1641-0PD-402335-00802..)...',
  ' 364A1585G3REVA-14A3924..........',
  ' 0039-0951-0HB-402335-00108..&...',
  ' 0039-1632-0UH-402335-00710.. ...',
  ' 0039-1630-0LT-402335-00710......',
  ' 0039-1632-0UP-402335-00710.. ...',
  ' 364A1585G3REVA-14A3925..........',
  ' 364A1585G3REVA-14A3919..........',
  ' 364A1585G3REVA-14A3917..........',
  ' 0039-1632-0UL-402335-00710.. ...',
  ' 0039-1630-0LW-402335-00710......',
  ' 0039-1630-0LV-402335-00710......',
  ' 364A1583G100-13L0049............',
  ' 0039-0951-0DP-402335-00108..&...',
  ' 0039-1630-0M2-402335-00710......',
  ' 0039-1632-0UK-4023
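A dictionary of this shape can be searched in reverse to find which bus a given serial number belongs to. The sketch below uses two entries copied from the output above; `bus_for_serial` is a hypothetical helper. The match is exact, so the leading space and trailing characters of each serial string must be kept as printed.

```python
def bus_for_serial(replaced, serial):
    """Return the bus key whose list contains serial (exact match), else None."""
    for bus, serials in replaced.items():
        if serial in serials:
            return bus
    return None


# Two entries copied from the output above.
replaced = {
    "bus_1": [" 0039-1811-00U-402335-00804......"],
    "bus_10": [" 0039-1638-0HC-402335-00802..&..."],
}

print(bus_for_serial(replaced, " 0039-1811-00U-402335-00804......"))
```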

The swapped_mod_dataframes function takes as input a directory, a module serial number, and a keyword. The function returns the data for the requested module characteristic as a list of dataframes, one per retrieval date. For example, passing your directory with the serial number for bus 1 module 1 and the keyword 'balancers' returns the cell balancer data for that module, ordered sequentially by time.

In [6]:
build_data_vis.swapped_mod_dataframes(directory, ' 0039-1811-00U-402335-00804......', 'balancers')

[                                            Off      On    TOTAL
 <Cell Balancers>  09/20/2018, 06:30:57                          
 CELL 1                                  1030246  106943  1137189
 CELL 2                                  1028063  109126  1137189
 CELL 3                                  1029071  108118  1137189
 CELL 4                                  1027367  109822  1137189
 CELL 5                                  1028041  109148  1137189
 CELL 6                                  1026996  110193  1137189
 CELL 7                                  1028023  109166  1137189
 CELL 8                                  1025681  111508  1137189
 CELL 9                                  1026528  110661  1137189
 CELL 10                                 1023558  113631  1137189
 CELL 11                                 1022981  114208  1137189
 CELL 12                                 1025729  111460  1137189,
                                             Off      On    TOTAL
 <Cell Ba
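One simple quantity to derive from the balancer counts is the share of samples during which each cell's balancer was on. The sketch below uses the Off and On values for the first three cells from the output above; `on_fraction` is a hypothetical helper, and it assumes Off plus On equals TOTAL, which holds for the rows shown.

```python
def on_fraction(off_count, on_count):
    """Return the fraction of samples for which the balancer was on."""
    total = off_count + on_count
    if total == 0:
        raise ValueError("no samples recorded")
    return on_count / total


# (Off, On) counts copied from the 09/20/2018 06:30:57 output above.
cells = {
    "CELL 1": (1030246, 106943),
    "CELL 2": (1028063, 109126),
    "CELL 3": (1029071, 108118),
}

for name, (off, on) in cells.items():
    print(f"{name}: {on_fraction(off, on):.2%}")
```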

## Visualizing the parsed data

Our visualization functions take the parsed data from our data building functions to help visualize trends in swapped modules.

visualise_mod_changes: This function uses the count_mod_changes output to produce a heat map for all buses showing when modules have been changed (a color change marks a module change). The heat map includes a dropdown menu to select the desired bus.

In [7]:
build_data_vis.visualise_mod_changes(directory)

visualize_mod_time: This function uses the build_module_average_df output to visualize the distribution of time spent at each voltage in the voltage range for a given module. For example, running this function on bus 1, module 1 returns a graph with 12 plotted lines, one for each retrieval date in bus 1, where the x axis is voltage and the y axis is time in seconds. A dropdown menu on the graph selects a specific date; the selected date stays in color while the other dates are rendered gray. The axes can also be rescaled by clicking and dragging or by scrolling with the mouse wheel.

In [10]:
build_data_vis.visualize_mod_time(directory, 'bus_1/', 11)
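When comparing dates with very different TOTAL values, it can help to convert each histogram row to fractions of its total before plotting. This is an optional preprocessing idea, not something the package is known to do. The sketch below uses row 0 of the build_module_df output shown earlier, whose bins sum to its TOTAL of 18699328; `to_fractions` is a hypothetical helper.

```python
def to_fractions(counts):
    """Scale a histogram row so that its values sum to 1."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [count / total for count in counts]


# Bin labels and row 0 from the build_module_df output (TOTAL column omitted).
bins = ["< 2.0", "2", "2.2", "2.4", "2.6", "2.8", "3", "3.2", "3.4", "3.6", "3.8", ">= 4.0"]
row = [0, 0, 0, 0, 127, 226879, 2601804, 13728787, 2060319, 81412, 0, 0]

fractions = to_fractions(row)
busiest_bin = bins[fractions.index(max(fractions))]
print(busiest_bin)
```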