
# Understanding Pandas Series and DataFrames

## Introduction

In this lesson, we're digging into Pandas Series and DataFrames - the two main data types you'll work with.

## Objectives
You will be able to:
- Understand and explain what Pandas Series and DataFrames are and how they differ from dictionaries and lists
- Create Series & DataFrames from dictionaries and lists
- Manipulate columns in DataFrames (`df.rename()`, `df.drop()`) 
- Manipulate the index in DataFrames (`df.reindex()`, `df.drop()`, `df.rename()`) 
- Manipulate column datatypes 

## Pandas Data Types vs. Native Python Data Types

As we talk more about Object-Oriented Programming (OOP), it becomes clear that using Pandas Series and DataFrames instead of built-in Python datatypes has a range of benefits. One of the most important is that Series and DataFrames come with many built-in methods that streamline standard practices and procedures. Some of these methods can also deliver dramatic performance gains. To read more about them, refer regularly to the [Pandas documentation](https://pandas.pydata.org/pandas-docs/stable/). It is impossible to know every pandas method at any given time, nor should you devote much time to memorization. We will not explain every Pandas method in depth in the upcoming lessons and labs; a critical part of every Data Scientist's job is investigating documentation to learn about these tools on your own.


**From the Pandas documentation:**

**pandas** is everyone's favorite data analysis library, providing fast, flexible, and expressive data structures designed to work with *relational* or table-like data (SQL table or Excel spreadsheet). It is a fundamental high-level building block for doing practical, real-world data analysis in Python. 

pandas is well suited for:

- Tabular data with heterogeneously-typed columns, as in an SQL table or Excel spreadsheet
- Ordered and unordered (not necessarily fixed-frequency) time series data.
- Arbitrary matrix data (homogeneously typed or heterogeneous) with row and column labels
- Any other form of observational / statistical data sets. The data actually need not be labeled at all to be placed into a pandas data structure

The two primary data structures of pandas, **Series** (1-dimensional) and **DataFrame** (2-dimensional), handle the vast majority of typical use cases in finance, statistics, social science, and many areas of engineering. Pandas is built on top of NumPy and is intended to integrate well within a scientific computing environment with many other 3rd party libraries.

<p>Here are just a few of the things that pandas does well:</p>
<blockquote>
<div><ul class="simple">
<li>Easy handling of <strong>missing data</strong> (represented as NaN) in floating point as
well as non-floating point data</li>
<li>Size mutability: columns can be <strong>inserted and deleted</strong> from DataFrame and
higher dimensional objects</li>
<li>Automatic and explicit <strong>data alignment</strong>: objects can be explicitly
aligned to a set of labels, or the user can simply ignore the labels and
let <cite>Series</cite>, <cite>DataFrame</cite>, etc. automatically align the data for you in
computations</li>
<li>Powerful, flexible <strong>group by</strong> functionality to perform
split-apply-combine operations on data sets, for both aggregating and
transforming data</li>
<li>Make it <strong>easy to convert</strong> ragged, differently-indexed data in other
Python and NumPy data structures into DataFrame objects</li>
<li>Intelligent label-based <strong>slicing</strong>, <strong>fancy indexing</strong>, and <strong>subsetting</strong>
of large data sets</li>
<li>Intuitive <strong>merging</strong> and <strong>joining</strong> data sets</li>
<li>Flexible <strong>reshaping</strong> and pivoting of data sets</li>
<li><strong>Hierarchical</strong> labeling of axes (possible to have multiple labels per
tick)</li>
<li>Robust IO tools for loading data from <strong>flat files</strong> (CSV and delimited),
Excel files, databases, and saving / loading data from the ultrafast <strong>HDF5
format</strong></li>
<li><strong>Time series</strong>-specific functionality: date range generation and frequency
conversion, moving window statistics, moving window linear regressions,
date shifting and lagging, etc.</li>
</ul>
</div></blockquote>
<p>Many of these principles are here to address the shortcomings frequently
experienced using other languages / scientific research environments. For data
scientists, working with data is typically divided into multiple stages:
munging and cleaning data, analyzing / modeling it, then organizing the results
of the analysis into a form suitable for plotting or tabular display. pandas
is the ideal tool for all of these tasks.</p>

# Introducing the most important objects: Series and DataFrames

## Setup

Let's take a little time to import the packages we need and to import and preview a dataset.

In [1]:
# import pandas under its conventional alias pd so we don't have to type "pandas" every time
import pandas as pd


## The Pandas Series

The **Series** data structure in Pandas is a <i>one-dimensional labeled array</i>. 

* Data in the array can be of any type (integers, strings, floating point numbers, Python objects, etc.). 
* Data within the array is homogeneous: every element of a Series shares a single dtype
* Pandas Series objects always have an index: this gives them both ndarray-like and dict-like properties.
    
<img src="../images/pandas_series1.jpg">
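
To make the "ndarray-like and dict-like" point concrete, here is a minimal sketch on a small Series. The name `temps` and its three values are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical three-day Series, used only for illustration
temps = pd.Series([40, 29, 15], index=["Mon", "Tue", "Wed"])

# dict-like behavior: look up a value by label, test membership on labels
tuesday = temps["Tue"]
has_wed = "Wed" in temps

# ndarray-like behavior: positional access, slicing, and boolean masks
first = temps.iloc[0]
first_two = temps.iloc[:2]
warm = temps[temps > 20]

print(tuesday, has_wed, first)
print(first_two)
print(warm)
```

Using `.iloc` for positions and plain brackets for labels keeps the two access styles unambiguous.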

# Creating a Pandas Series

There are many ways to create a Pandas Series object. Some of the most common are:
- Creation from a list
- Creation from a dictionary
- Creation from a ndarray
- From an external source like a file

In [2]:
# define the data and index as lists
temperature = [40, 29, 15, 19, 11, -15, 9]
days = ['Mon','Tue','Wed','Thu','Fri','Sat','Sun']

# create series 
series_from_list = pd.Series(temperature, index=days)
series_from_list

Mon    40
Tue    29
Wed    15
Thu    19
Fri    11
Sat   -15
Sun     9
dtype: int64

### From a Dictionary

In [3]:
my_dict = {'Mon': 40, 'Tue': 29, 'Wed': 15, 'Thu': 19, 'Fri': 11, 'Sat': -15, 'Sun': 9}
series_from_dict = pd.Series(my_dict)
series_from_dict

Mon    40
Tue    29
Wed    15
Thu    19
Fri    11
Sat   -15
Sun     9
dtype: int64

<img src="../images/pandas_series2.jpg">

### From a numpy array

In [7]:
import numpy as np
my_array = np.linspace(0,10,15)
series_from_ndarray = pd.Series(my_array)
series_from_ndarray

0      0.000000
1      0.714286
2      1.428571
3      2.142857
4      2.857143
5      3.571429
6      4.285714
7      5.000000
8      5.714286
9      6.428571
10     7.142857
11     7.857143
12     8.571429
13     9.285714
14    10.000000
dtype: float64

# Vectorized operations also work in pandas Series

In [8]:
np.exp(series_from_list)

Mon    2.353853e+17
Tue    3.931334e+12
Wed    3.269017e+06
Thu    1.784823e+08
Fri    5.987414e+04
Sat    6.737947e-03
Sun    8.103084e+03
dtype: float64
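
Ordinary arithmetic and comparisons are vectorized in the same way as NumPy functions. The sketch below rebuilds the Series from the earlier cell so it runs on its own; treating the values as degrees Celsius is an assumption made purely for the example.

```python
import pandas as pd

temperature = [40, 29, 15, 19, 11, -15, 9]
days = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
series_from_list = pd.Series(temperature, index=days)

# Assumption for illustration only: the values are degrees Celsius
fahrenheit = series_from_list * 9 / 5 + 32

# Comparisons are applied element by element and return a boolean Series
below_freezing = series_from_list < 0

print(fahrenheit)
print(below_freezing)
```

No explicit loop is needed; the operation is applied to every element and the index labels are carried along.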

# Pandas DataFrames

### We have the min and max temperatures in London for each month of the year. We would like to find a function that describes this data and show it graphically. The dataset is given below.

In [34]:
df_new = pd.DataFrame({'Max' : [39,41,43,47,49,51,45,38,37,29,27,25], 
                       'Min': [21,23,27,28,32,35,31,28,21,19,17,18]})
df_new.head()

|   | Max | Min |
|---|-----|-----|
| 0 | 39 | 21 |
| 1 | 41 | 23 |
| 2 | 43 | 27 |
| 3 | 47 | 28 |
| 4 | 49 | 32 |


DataFrame is a 2-dimensional labeled data structure with columns of potentially different types. You can think of it like a spreadsheet or SQL table, or a dict of Series objects.

<img src="../images/dataframe1.jpg">

You can create a DataFrame from:

* Dict of 1D ndarrays, lists, dicts, or Series
* 2-D numpy.ndarray
* From text, CSV, Excel files or databases
* Many other ways
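
As a rough sketch of the first two options, the block below builds one DataFrame from a dict of Series and another from a 2-D NumPy array. The variable names and the column labels `a` and `b` are hypothetical.

```python
import numpy as np
import pandas as pd

# Dict of Series: the Series are aligned on their index labels
tokyo = pd.Series([15, 19], index=["12-1", "12-2"])
paris = pd.Series([-2, 0, 2], index=["12-1", "12-2", "12-3"])
from_series = pd.DataFrame({"Tokyo": tokyo, "Paris": paris})

# 2-D ndarray: rows and columns come from the array's shape
from_array = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["a", "b"])

print(from_series)  # Tokyo has no value for "12-3", so that cell is NaN
print(from_array)
```

Note how the dict-of-Series route fills a missing label with `NaN` instead of raising an error.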

Here's an example where we have set the Dates column to be the index and label for the rows. 

<img src="../images/dataframe2.jpg">

# The above image can be represented as a pandas DataFrame as below

In [4]:
df_new = pd.DataFrame({"Dates" : ["12-1" ,"12-2","12-3","12-4","12-5","12-6","12-7"],
                       'Tokyo' : [15,19,15,11,9,8,13], 
                       'Paris': [-2,0,2,5,7,-5,-3],
                       "Mumbai" : [20,18,23,19,25,27,23]
                      })
df_new.head()

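|   | Dates | Tokyo | Paris | Mumbai |
|---|-------|-------|-------|--------|
| 0 | 12-1 | 15 | -2 | 20 |
| 1 | 12-2 | 19 | 0 | 18 |
| 2 | 12-3 | 15 | 2 | 23 |
| 3 | 12-4 | 11 | 5 | 19 |
| 4 | 12-5 | 9 | 7 | 25 |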
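
The DataFrame above still uses the default integer index. One way to make `Dates` the row labels is `set_index`; the sketch below rebuilds the same data so it runs on its own, and the names `indexed` and `paris_dec_3` are hypothetical.

```python
import pandas as pd

df_new = pd.DataFrame({"Dates": ["12-1", "12-2", "12-3", "12-4", "12-5", "12-6", "12-7"],
                       "Tokyo": [15, 19, 15, 11, 9, 8, 13],
                       "Paris": [-2, 0, 2, 5, 7, -5, -3],
                       "Mumbai": [20, 18, 23, 19, 25, 27, 23]})

# set_index returns a new DataFrame whose row labels are the Dates values
indexed = df_new.set_index("Dates")

# Rows can now be selected by date label, columns by name
paris_dec_3 = indexed.loc["12-3", "Paris"]

print(indexed.head())
print(paris_dec_3)
```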


# Let's create a pandas DataFrame with details of flight information

In [38]:
data = pd.DataFrame({"Dates" : ["12-1" ,"12-2","12-3","12-4","12-5"],
                    "Airline" : ["KLM" , "AirFrance" ,"SwissAir" ,"RyanAir","Emirates"],
                     
                    "Departure" : ["Tokyo","Madrid", "Mumbai" , "London" , "NewYork"],
                     
                     "Arrival" : ["Paris","Milan","Stockholm","Brussels" , "Accra"],
                     
                     "FlightNumber" : [10045,10050,10065,10070,10080],
                     
                     "RecentDelays" : ["23-47 hours","10-14 hours","4-18 hours","13 hours",
                                       "20-30 hours"],
                     })
data

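|   | Dates | Airline | Departure | Arrival | FlightNumber | RecentDelays |
|---|-------|---------|-----------|---------|--------------|--------------|
| 0 | 12-1 | KLM | Tokyo | Paris | 10045 | 23-47 hours |
| 1 | 12-2 | AirFrance | Madrid | Milan | 10050 | 10-14 hours |
| 2 | 12-3 | SwissAir | Mumbai | Stockholm | 10065 | 4-18 hours |
| 3 | 12-4 | RyanAir | London | Brussels | 10070 | 13 hours |
| 4 | 12-5 | Emirates | NewYork | Accra | 10080 | 20-30 hours |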


# Renaming columns

In [39]:
data.rename(columns={'RecentDelays': "Delays"}, inplace=True) # inplace=True modifies data directly instead of returning a copy

In [None]:
# Let's check our result and see whether RecentDelays has been renamed to Delays

In [40]:
data.head()

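|   | Dates | Airline | Departure | Arrival | FlightNumber | Delays |
|---|-------|---------|-----------|---------|--------------|--------|
| 0 | 12-1 | KLM | Tokyo | Paris | 10045 | 23-47 hours |
| 1 | 12-2 | AirFrance | Madrid | Milan | 10050 | 10-14 hours |
| 2 | 12-3 | SwissAir | Mumbai | Stockholm | 10065 | 4-18 hours |
| 3 | 12-4 | RyanAir | London | Brussels | 10070 | 13 hours |
| 4 | 12-5 | Emirates | NewYork | Accra | 10080 | 20-30 hours |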


In [45]:
data.reset_index(inplace=True) # Reset the index to the default
                # the old index is added as a column, and a
                # new sequential index is used
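
If the old index is not worth keeping, `reset_index` accepts `drop=True`, which discards it instead of adding it as a column. A minimal sketch on a hypothetical three-row frame (the name `flights` and the index values are made up for the example):

```python
import pandas as pd

# Hypothetical frame with a non-default index
flights = pd.DataFrame({"Airline": ["KLM", "AirFrance", "SwissAir"]},
                       index=[10, 20, 30])

kept = flights.reset_index()              # old index becomes a column named "index"
dropped = flights.reset_index(drop=True)  # old index is discarded

print(kept)
print(dropped)
```

With `drop=True` there is no leftover `index` column to remove afterwards.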

In [47]:
data

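|   | index | Dates | Airline | Departure | Arrival | FlightNumber | Delays |
|---|-------|-------|---------|-----------|---------|--------------|--------|
| 0 | 0 | 12-1 | KLM | Tokyo | Paris | 10045 | 23-47 hours |
| 1 | 1 | 12-2 | AirFrance | Madrid | Milan | 10050 | 10-14 hours |
| 2 | 2 | 12-3 | SwissAir | Mumbai | Stockholm | 10065 | 4-18 hours |
| 3 | 3 | 12-4 | RyanAir | London | Brussels | 10070 | 13 hours |
| 4 | 4 | 12-5 | Emirates | NewYork | Accra | 10080 | 20-30 hours |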


# Dropping columns

In [48]:
# note: axis=1 refers to columns and axis=0 refers to rows
data.drop("FlightNumber" ,axis=1 ,inplace=True) 

In [49]:
# dropping the new index column too
data.drop("index", axis=1 , inplace=True)

# Checking the final result
+ We can now see the columns have been removed from the DataFrame

In [50]:
data

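|   | Dates | Airline | Departure | Arrival | Delays |
|---|-------|---------|-----------|---------|--------|
| 0 | 12-1 | KLM | Tokyo | Paris | 23-47 hours |
| 1 | 12-2 | AirFrance | Madrid | Milan | 10-14 hours |
| 2 | 12-3 | SwissAir | Mumbai | Stockholm | 4-18 hours |
| 3 | 12-4 | RyanAir | London | Brussels | 13 hours |
| 4 | 12-5 | Emirates | NewYork | Accra | 20-30 hours |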


# Let's Dive Deeper - Now that you get the point

![yes](https://media.giphy.com/media/5LcfoE5u34kfNvW1Oi/giphy.gif)

In [51]:
import os
cwd = os.getcwd()
cwd

'/Users/flatironschool/Desktop/iNueron/Introduction-to-Python/Part 6 - Pandas/02.Pandas_Series_and_Dataframe'

**Loop through the files in your working directory and display their names** 

In [56]:
files = [f for f in os.listdir('.') if os.path.isfile(f)]
for f in files:
    print(f)

02.Series&DataFrames_Exercise.ipynb
01.Intro-Pandas-Series-and-DataFrames.ipynb


In [59]:
df = pd.read_csv("../data/turnstile.txt")
df.head()

|   | C/A | UNIT | SCP | STATION | LINENAME | DIVISION | DATE | TIME | DESC | ENTRIES | EXITS |
|---|-----|------|-----|---------|----------|----------|------|------|------|---------|-------|
| 0 | A002 | R051 | 02-00-00 | 59 ST | NQR456W | BMT | 08/25/2018 | 00:00:00 | REGULAR | 6736067 | 2283184 |
| 1 | A002 | R051 | 02-00-00 | 59 ST | NQR456W | BMT | 08/25/2018 | 04:00:00 | REGULAR | 6736087 | 2283188 |
| 2 | A002 | R051 | 02-00-00 | 59 ST | NQR456W | BMT | 08/25/2018 | 08:00:00 | REGULAR | 6736105 | 2283229 |
| 3 | A002 | R051 | 02-00-00 | 59 ST | NQR456W | BMT | 08/25/2018 | 12:00:00 | REGULAR | 6736180 | 2283314 |
| 4 | A002 | R051 | 02-00-00 | 59 ST | NQR456W | BMT | 08/25/2018 | 16:00:00 | REGULAR | 6736349 | 2283384 |


In [60]:
# getting information on the data. From here we can find out a lot about what we are dealing with
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 197625 entries, 0 to 197624
Data columns (total 11 columns):
 #   Column                                                                Non-Null Count   Dtype 
---  ------                                                                --------------   ----- 
 0   C/A                                                                   197625 non-null  object
 1   UNIT                                                                  197625 non-null  object
 2   SCP                                                                   197625 non-null  object
 3   STATION                                                               197625 non-null  object
 4   LINENAME                                                              197625 non-null  object
 5   DIVISION                                                              197625 non-null  object
 6   DATE                                                                  197625 non-null  objec

## Data Munging/ Manipulation
This MTA turnstile dataset is a great place for us to get our hands dirty wrangling and cleaning some data! Here's the data dictionary if you want to know more about the dataset http://web.mta.info/developers/resources/nyct/turnstile/ts_Field_Description.txt  

Let's start by filtering the data down to all stations for the N line. To do this, we'll need to extract all "N"s from the LINENAME column, or create a column indicating whether or not the stop is an N line stop.
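
Once a True/False indicator exists, filtering rows is a matter of boolean indexing. The sketch below uses a hypothetical three-row miniature of the turnstile data (the frame `toy` and its rows are made up for illustration) and builds the mask with pandas' string accessor.

```python
import pandas as pd

# Hypothetical miniature of the turnstile data, for illustration only
toy = pd.DataFrame({
    "STATION": ["59 ST", "5 AV/59 ST", "RIT-ROOSEVELT"],
    "LINENAME": ["NQR456W", "NQRW", "R"],
})

# Boolean Series: True where the line name contains the letter N
mask = toy["LINENAME"].str.contains("N", regex=False)

# Passing the mask in brackets keeps only the rows where it is True
n_line = toy[mask]

print(n_line)
```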

### Defining Functions

At this point, we will need to define some functions to perform data manipulation so that we can reuse them easily. Let's review how to do this. In Python, we define a function using the `def` keyword, followed by the function's name and a pair of parentheses. Any required (or optional) parameters are listed within the parentheses, much as arguments are when you call a function. You then specify the function's behavior after a colon, in an indented block, much the same way you would for a `for` loop or conditional block. Finally, if you want your function to return something (as the `list.pop()` method does), as opposed to simply doing something in the background and returning nothing (as `list.append()` does), you must use the `return` keyword. Note that as soon as execution reaches a `return` statement, the function terminates and no further statements in it are executed. In other words, `return` both hands back a value and ends the function.
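
The point about `return` ending the function can be seen in a small sketch. The function `first_negative` is a hypothetical helper written for this illustration.

```python
def first_negative(values):
    """Return the first negative number in values, or None if there is none."""
    for value in values:
        if value < 0:
            return value  # the function stops here; later items are never inspected
    return None  # reached only when the loop finishes without returning


print(first_negative([3, -1, -5]))
print(first_negative([1, 2]))
```

The first call returns as soon as it sees `-1`, so `-5` is never examined.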

In [61]:
def contains_n(text):
    if 'N' in text:
        return True
    else:
        return False

#or the shorter, more pythonic:
def contains_n(text):
    bool_val = 'N' in text
    return bool_val

## Explanation
Below, we use the `.map()` method for Pandas Series. It lets us pass a function that is applied to every entry in the Series. As shorthand, we could also pass a lambda function to determine whether or not each row is on the N line:

`df['On_N_Line'] = df.LINENAME.map(lambda x: 'N' in x)` 


This is shorter and equivalent to the functions defined above. Lambda functions are often more convenient, but have less functionality than defining functions explicitly.
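
As a quick check of that equivalence, the sketch below applies a named function, a lambda, and pandas' vectorized `.str.contains` to a small hypothetical Series of line names. The third option is an alternative not used in this lesson.

```python
import pandas as pd


def contains_n(text):
    return 'N' in text


# Hypothetical sample of LINENAME values
linenames = pd.Series(["NQR456W", "R", "ACENQRS1237W", "456"])

via_function = linenames.map(contains_n)
via_lambda = linenames.map(lambda x: 'N' in x)
via_str = linenames.str.contains("N", regex=False)  # vectorized string method

print(via_function.tolist())
print(via_lambda.tolist())
print(via_str.tolist())
```

All three produce the same True/False values for this sample.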

In [62]:
# Creating the new column for all linenames containing N
df["On_N_Line"] = df.LINENAME.map(contains_n)

In [70]:
print("These are the first two rows of the dataframe")

display(df.head(2))
print("-----------------------------------------------------------------------------------")

print("These are the last two rows of the dataframe")
df.tail(2)

These are the first two rows of the dataframe


|   | C/A | UNIT | SCP | STATION | LINENAME | DIVISION | DATE | TIME | DESC | ENTRIES | EXITS | On_N_Line |
|---|-----|------|-----|---------|----------|----------|------|------|------|---------|-------|-----------|
| 0 | A002 | R051 | 02-00-00 | 59 ST | NQR456W | BMT | 08/25/2018 | 00:00:00 | REGULAR | 6736067 | 2283184 | True |
| 1 | A002 | R051 | 02-00-00 | 59 ST | NQR456W | BMT | 08/25/2018 | 04:00:00 | REGULAR | 6736087 | 2283188 | True |


-----------------------------------------------------------------------------------
These are the last two rows of the dataframe


|   | C/A | UNIT | SCP | STATION | LINENAME | DIVISION | DATE | TIME | DESC | ENTRIES | EXITS | On_N_Line |
|---|-----|------|-----|---------|----------|----------|------|------|------|---------|-------|-----------|
| 197623 | TRAM2 | R469 | 00-05-01 | RIT-ROOSEVELT | R | RIT | 08/31/2018 | 17:00:00 | REGULAR | 5554 | 348 | False |
| 197624 | TRAM2 | R469 | 00-05-01 | RIT-ROOSEVELT | R | RIT | 08/31/2018 | 21:00:00 | REGULAR | 5554 | 348 | False |


In [None]:
# Let's check out our new column

In [71]:
df.On_N_Line.value_counts(normalize=True)

False    0.870441
True     0.129559
Name: On_N_Line, dtype: float64

*If you have not seen `value_counts()` before, this would be a good time to check out the [documentation for it](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.value_counts.html) !*
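
For a feel of what `normalize=True` changes, here is a minimal sketch on a hypothetical four-element boolean Series named `flags`.

```python
import pandas as pd

# Hypothetical indicator column: one True, three False
flags = pd.Series([True, False, False, False])

counts = flags.value_counts()                # absolute counts per distinct value
shares = flags.value_counts(normalize=True)  # same counts divided by the total

print(counts.to_dict())
print(shares.to_dict())
```

With `normalize=True` the results sum to 1, which is why the output above reads as proportions of rows.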

## Cleaning Column Names
Sometimes, you have messy column names.

In [72]:
df.columns

Index(['C/A', 'UNIT', 'SCP', 'STATION', 'LINENAME', 'DIVISION', 'DATE', 'TIME',
       'DESC', 'ENTRIES',
       'EXITS                                                               ',
       'On_N_Line'],
      dtype='object')

You might notice that the `EXITS` column name is followed by a lot of annoying trailing whitespace.
We can quickly use a list comprehension to clean up all of the column names.

In [78]:
new = [x.strip() for x in df.columns] # strip leading/trailing whitespace from every column name
df.columns = new

In [79]:
# Checking the results
df.columns

Index(['C/A', 'UNIT', 'SCP', 'STATION', 'LINENAME', 'DIVISION', 'DATE', 'TIME',
       'DESC', 'ENTRIES', 'EXITS', 'On_N_Line'],
      dtype='object')
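
The column `Index` also supports vectorized string methods, so the same cleanup can be sketched without a list comprehension. The frame `messy` and its column names are hypothetical; note that `.str.strip()` only removes leading and trailing whitespace, so a name with an internal space stays intact.

```python
import pandas as pd

# Hypothetical empty frame with one padded column name and one name containing a space
messy = pd.DataFrame(columns=["ENTRIES", "EXITS       ", "59 ST"])

# .str.strip() is applied to every column label at once
messy.columns = messy.columns.str.strip()

print(list(messy.columns))
```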

# Dealing with Column Types

+ Another common data munging technique can be reformatting column types. We first previewed column types above using the `df.info()` method, which we'll repeat here.

In [80]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 197625 entries, 0 to 197624
Data columns (total 12 columns):
 #   Column                                                                Non-Null Count   Dtype 
---  ------                                                                --------------   ----- 
 0   C/A                                                                   197625 non-null  object
 1   UNIT                                                                  197625 non-null  object
 2   SCP                                                                   197625 non-null  object
 3   STATION                                                               197625 non-null  object
 4   LINENAME                                                              197625 non-null  object
 5   DIVISION                                                              197625 non-null  object
 6   DATE                                                                  197625 non-null  objec

A common transformation needed is converting numbers stored as text to *float* or *integer* representations. In this case `ENTRIES` and `EXITS` are appropriately *int64*, but to practice, we'll demonstrate changing that to a float and then back to an int.
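
For the text-to-number case itself, a minimal sketch: the Series `as_text` below holds a few `ENTRIES` values as strings (a hypothetical setup, since the real column is already numeric), and `astype` converts them.

```python
import pandas as pd

# Numbers stored as text (hypothetical; the real ENTRIES column is already int64)
as_text = pd.Series(["6736067", "6736087", "6736105"])

as_int = as_text.astype(int)      # parse each string as an integer
as_float = as_text.astype(float)  # or as a floating point number

print(as_int.dtype, as_float.dtype)
print(as_int.sum())
```

Once converted, numeric operations such as `sum()` behave as arithmetic rather than string concatenation.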

In [81]:
print(df.ENTRIES.dtype) # We can also check an individual column's type rather than all of them


int64


In [82]:
# Changing the column data type to float
df.ENTRIES = df.ENTRIES.astype(float)
print(df.ENTRIES.dtype) # Checking our changes

float64


In [83]:
# Converting Back to integer
print(df.ENTRIES.dtype) 
df.ENTRIES = df.ENTRIES.astype(int)
print(df.ENTRIES.dtype)

float64
int64


# <font style="color:red;">**Warning: Running the code below will raise an error**</font>

Attempting to convert a string column to int or float will produce **errors** if it contains non-numeric characters.

In [None]:
df.LINENAME = df.LINENAME.astype(int) # raises a ValueError because LINENAME holds non-numeric text such as 'NQR456W'
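
When a column mixes numeric and non-numeric text, one option is `pd.to_numeric` with `errors="coerce"`, which turns unparseable entries into `NaN` instead of raising. The Series `mixed` below is a hypothetical example, not part of the lesson's data.

```python
import pandas as pd

# Hypothetical column mixing numeric strings with a non-numeric one
mixed = pd.Series(["10045", "NQR456W", "10070"])

# astype(int) fails as soon as it meets a value it cannot parse
try:
    mixed.astype(int)
    raised = False
except ValueError as err:
    raised = True
    print("astype failed:", err)

# errors="coerce" replaces unparseable values with NaN instead of raising
coerced = pd.to_numeric(mixed, errors="coerce")
print(coerced)
```

The coerced result is a float column, because `NaN` cannot be stored in a plain integer dtype.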

# Here is a really good resource for learning how to deal with datetimes:
https://www.dataquest.io/blog/python-datetime-tutorial/

## Converting Dates
A slightly more complicated data type transformation is creating *date* or *datetime* objects. These are built-in datatypes with useful functionality, such as quickly calculating the time between two dates or extracting the day of the week from a given date. However, if we look at our current date column, we will notice it is simply a *non-null object* (most likely plain text).

In [84]:
df.DATE

0         08/25/2018
1         08/25/2018
2         08/25/2018
3         08/25/2018
4         08/25/2018
             ...    
197620    08/31/2018
197621    08/31/2018
197622    08/31/2018
197623    08/31/2018
197624    08/31/2018
Name: DATE, Length: 197625, dtype: object

In [85]:
# Checking the data type of the dates column
df.DATE.dtype

dtype('O')

## `pd.to_datetime()`
This is the handiest function for converting strings to datetime objects.

In [86]:
# Often you can simply pass the series into this method.
pd.to_datetime(df.DATE).head() #It is good practice to preview the results first
# This prevents overwriting data if some error was produced. However everything looks good!

0   2018-08-25
1   2018-08-25
2   2018-08-25
3   2018-08-25
4   2018-08-25
Name: DATE, dtype: datetime64[ns]

# Note : 

Sometimes the above won't work and you'll have to explicitly pass how the date is formatted.  
To do that, you have to use some datetime codes. Here's a preview of some of the most common ones:  
<img src="../images/datetime.png" width=600>

In [87]:
# Notice we include delimiters (in this case /) between the codes for month,day,year
pd.to_datetime(df.DATE, format='%m/%d/%Y').head()

0   2018-08-25
1   2018-08-25
2   2018-08-25
3   2018-08-25
4   2018-08-25
Name: DATE, dtype: datetime64[ns]
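
An explicit format matters most when the string layout is ambiguous or spans several fields. The sketch below works on a hypothetical two-row frame `toy`: it joins `DATE` and `TIME` into one timestamp, then parses a day-first date string that would be ambiguous without a format.

```python
import pandas as pd

# Hypothetical two-row excerpt with separate date and time strings
toy = pd.DataFrame({
    "DATE": ["08/25/2018", "08/25/2018"],
    "TIME": ["00:00:00", "04:00:00"],
})

# Concatenate the strings, then describe the combined layout with format codes
timestamps = pd.to_datetime(toy["DATE"] + " " + toy["TIME"],
                            format="%m/%d/%Y %H:%M:%S")

# The same digits in day-first order need a different format string
day_first = pd.to_datetime("25/08/2018", format="%d/%m/%Y")

print(timestamps)
print(day_first)
```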

# Applying and saving changes

In [88]:
df["DATE"] = pd.to_datetime(df.DATE)
print(df.DATE.dtype)   # notice the dtype has changed from object ('O') to datetime64

datetime64[ns]


In [89]:
# check result
df.head(3)

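|   | C/A | UNIT | SCP | STATION | LINENAME | DIVISION | DATE | TIME | DESC | ENTRIES | EXITS | On_N_Line |
|---|-----|------|-----|---------|----------|----------|------|------|------|---------|-------|-----------|
| 0 | A002 | R051 | 02-00-00 | 59 ST | NQR456W | BMT | 2018-08-25 | 00:00:00 | REGULAR | 6736067 | 2283184 | True |
| 1 | A002 | R051 | 02-00-00 | 59 ST | NQR456W | BMT | 2018-08-25 | 04:00:00 | REGULAR | 6736087 | 2283188 | True |
| 2 | A002 | R051 | 02-00-00 | 59 ST | NQR456W | BMT | 2018-08-25 | 08:00:00 | REGULAR | 6736105 | 2283229 | True |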


# Datetime Methods

+ Now that we have converted the `DATE` field to a datetime object we can use some handy built-in methods.

In [90]:
# the .dt accessor exposes the built-in datetime methods (it only works on datetime columns)
df.DATE.dt.day_name().head()

0    Saturday
1    Saturday
2    Saturday
3    Saturday
4    Saturday
Name: DATE, dtype: object

In [92]:
df.DATE.dt.month_name().head()

0    August
1    August
2    August
3    August
4    August
Name: DATE, dtype: object
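
Datetime values also support arithmetic. A minimal sketch using the first and last dates that appear in this dataset (the variable names are hypothetical):

```python
import pandas as pd

dates = pd.to_datetime(pd.Series(["08/25/2018", "08/31/2018"]), format="%m/%d/%Y")

# Subtracting two timestamps gives a Timedelta
span = dates.iloc[1] - dates.iloc[0]

# dayofweek numbers the days Monday=0 through Sunday=6
weekday_numbers = dates.dt.dayofweek

print(span.days)
print(weekday_numbers.tolist())
print(dates.dt.day_name().tolist())
```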

## Renaming Columns
You can rename columns using dictionaries as follows:

In [93]:
df = df.rename(columns={"DATE": "date"})
# check result
df.head()

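|   | C/A | UNIT | SCP | STATION | LINENAME | DIVISION | date | TIME | DESC | ENTRIES | EXITS | On_N_Line |
|---|-----|------|-----|---------|----------|----------|------|------|------|---------|-------|-----------|
| 0 | A002 | R051 | 02-00-00 | 59 ST | NQR456W | BMT | 2018-08-25 | 00:00:00 | REGULAR | 6736067 | 2283184 | True |
| 1 | A002 | R051 | 02-00-00 | 59 ST | NQR456W | BMT | 2018-08-25 | 04:00:00 | REGULAR | 6736087 | 2283188 | True |
| 2 | A002 | R051 | 02-00-00 | 59 ST | NQR456W | BMT | 2018-08-25 | 08:00:00 | REGULAR | 6736105 | 2283229 | True |
| 3 | A002 | R051 | 02-00-00 | 59 ST | NQR456W | BMT | 2018-08-25 | 12:00:00 | REGULAR | 6736180 | 2283314 | True |
| 4 | A002 | R051 | 02-00-00 | 59 ST | NQR456W | BMT | 2018-08-25 | 16:00:00 | REGULAR | 6736349 | 2283384 | True |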
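
`rename` also accepts a function instead of a dictionary, and it can relabel rows through its `index` argument. A minimal sketch on a hypothetical one-row frame `toy`:

```python
import pandas as pd

# Hypothetical one-row frame, for illustration only
toy = pd.DataFrame({"DATE": ["2018-08-25"], "ENTRIES": [6736067]}, index=[0])

lowered = toy.rename(columns=str.lower)        # apply a function to every column name
relabelled = toy.rename(index={0: "first"})    # map old row labels to new ones

print(list(lowered.columns))
print(list(relabelled.index))
print(list(toy.columns))  # the original is unchanged because inplace was not used
```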


# Dropping columns

#### If you don't pass the axis=1 parameter, pandas will try and drop a row with the specified index
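
The difference can be sketched on a hypothetical three-row frame. The `columns=` and `index=` keywords are an alternative to `axis` that state the intent explicitly.

```python
import pandas as pd

# Hypothetical miniature frame with the default integer row labels
toy = pd.DataFrame({"C/A": ["A002", "A002", "TRAM2"],
                    "UNIT": ["R051", "R051", "R469"]})

without_column = toy.drop(columns="C/A")  # same effect as toy.drop("C/A", axis=1)
without_row = toy.drop(index=0)           # same effect as toy.drop(0)

# With no axis given, pandas searches the row labels and finds no "C/A"
try:
    toy.drop("C/A")
    raised = False
except KeyError as err:
    raised = True
    print("KeyError:", err)

print(without_column)
print(without_row)
```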

In [94]:
df2 = df.copy()

In [95]:
df.drop("C/A", axis=1, inplace=True) # equivalent without inplace: df = df.drop("C/A", axis=1)

## Setting a New Index
+ It can also be helpful to set an index such as when graphing.

In [96]:
df = df.set_index("date")
df.head()

| date | C/A | UNIT | SCP | STATION | LINENAME | DIVISION | TIME | DESC | ENTRIES | EXITS | On_N_Line |
|------|-----|------|-----|---------|----------|----------|------|------|---------|-------|-----------|
| 2018-08-25 | A002 | R051 | 02-00-00 | 59 ST | NQR456W | BMT | 00:00:00 | REGULAR | 6736067 | 2283184 | True |
| 2018-08-25 | A002 | R051 | 02-00-00 | 59 ST | NQR456W | BMT | 04:00:00 | REGULAR | 6736087 | 2283188 | True |
| 2018-08-25 | A002 | R051 | 02-00-00 | 59 ST | NQR456W | BMT | 08:00:00 | REGULAR | 6736105 | 2283229 | True |
| 2018-08-25 | A002 | R051 | 02-00-00 | 59 ST | NQR456W | BMT | 12:00:00 | REGULAR | 6736180 | 2283314 | True |
| 2018-08-25 | A002 | R051 | 02-00-00 | 59 ST | NQR456W | BMT | 16:00:00 | REGULAR | 6736349 | 2283384 | True |
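
With dates in the index, rows can be selected by date label, and `reset_index` moves the index back into a regular column. A minimal sketch on a hypothetical three-row frame (the third `ENTRIES` value is made up for the example):

```python
import pandas as pd

# Hypothetical miniature frame; the last ENTRIES value is invented for illustration
toy = pd.DataFrame({
    "date": pd.to_datetime(["2018-08-25", "2018-08-25", "2018-08-26"]),
    "ENTRIES": [6736067, 6736087, 6736300],
})

indexed = toy.set_index("date")

# Select every row labelled with a given day
aug_25 = indexed.loc["2018-08-25"]

# Undo the operation: the index becomes an ordinary column again
restored = indexed.reset_index()

print(aug_25)
print(restored)
```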
