### Data Cleaning

It's commonly said that data scientists spend 80% of their time cleaning and manipulating data and only 20% of their time analyzing it. That time is well spent: analyzing dirty data can lead you to draw inaccurate conclusions, whether from a simple analysis or from a machine learning model. In this course, you will learn how to identify, diagnose, and treat a variety of data cleaning problems in Python, ranging from simple to advanced. You will deal with improper data types, check that your data is in the correct range, handle missing data, perform record linkage, and more!

### 1. Common data problems 

- Inconsistent column names
- Missing Data
- Outliers
- Duplicate rows
- Untidiness
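
Several of these problems can be spotted with a few pandas calls. The sketch below runs on a small, made-up DataFrame; the column names, the values, and the 0 to 120 age range are all hypothetical and are not taken from the course data.

```python
import pandas as pd

# Hypothetical toy data with a few deliberate problems
df = pd.DataFrame({
    "Ride ID": [1, 2, 2, 4],
    "duration ": ["12 minutes", "24 minutes", "24 minutes", None],
    "age": [34, 29, 29, 210],
})

# Inconsistent column names: strip whitespace, lowercase, use underscores
df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_")

# Missing data: count missing values per column
missing_per_column = df.isna().sum()

# Duplicate rows: count rows that repeat an earlier row exactly
n_duplicates = int(df.duplicated().sum())

# Outliers: a crude range check (the 0-120 bounds are an assumption)
age_out_of_range = df[(df["age"] < 0) | (df["age"] > 120)]

print(list(df.columns))
print(missing_per_column)
print(n_duplicates)
print(age_out_of_range)
```

Checks like these only flag candidates; whether a flagged row is really an error depends on the dataset.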

In [1]:
#Import libraries
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import random
from random import randint

import extra # Just a file containing useful lists

In [2]:
ride_sharing = pd.read_csv('../Datasets/ride_sharing_new.csv')
ride_sharing.head()

Unnamed: 0.1,Unnamed: 0,duration,station_A_id,station_A_name,station_B_id,station_B_name,bike_id,user_type,user_birth_year,user_gender
0,0,12 minutes,81,Berry St at 4th St,323,Broadway at Kearny,5480,2,1959,Male
1,1,24 minutes,3,Powell St BART Station (Market St at 4th St),118,Eureka Valley Recreation Center,5193,2,1965,Male
2,2,8 minutes,67,San Francisco Caltrain Station 2 (Townsend St...,23,The Embarcadero at Steuart St,3652,3,1993,Male
3,3,4 minutes,16,Steuart St at Market St,28,The Embarcadero at Bryant St,1883,1,1979,Male
4,4,11 minutes,22,Howard St at Beale St,350,8th St at Brannan St,4626,2,1994,Male


### Numeric data or ... ?
You'll be working with bicycle ride sharing data in San Francisco called ride_sharing. It contains information on the start and end stations, the trip duration, and some user information for a bike sharing service.

The user_type column contains information on whether a user is taking a free ride and takes on the following values:

- 1 for free riders.
- 2 for pay per ride.
- 3 for monthly subscribers.

In this instance, you will print the information of ride_sharing using .info() and see a firsthand example of how an incorrect data type can flaw your analysis of the dataset. The pandas package is imported as pd.

In [3]:
# Print the information of ride_sharing
print(ride_sharing.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 25760 entries, 0 to 25759
Data columns (total 10 columns):
 #   Column           Non-Null Count  Dtype 
---  ------           --------------  ----- 
 0   Unnamed: 0       25760 non-null  int64 
 1   duration         25760 non-null  object
 2   station_A_id     25760 non-null  int64 
 3   station_A_name   25760 non-null  object
 4   station_B_id     25760 non-null  int64 
 5   station_B_name   25760 non-null  object
 6   bike_id          25760 non-null  int64 
 7   user_type        25760 non-null  int64 
 8   user_birth_year  25760 non-null  int64 
 9   user_gender      25760 non-null  object
dtypes: int64(6), object(4)
memory usage: 2.0+ MB
None


In [4]:
# Print summary statistics of user_type column
print(ride_sharing['user_type'].describe())

count    25760.000000
mean         2.008385
std          0.704541
min          1.000000
25%          2.000000
50%          2.000000
75%          3.000000
max          3.000000
Name: user_type, dtype: float64


In [5]:
# Convert user_type from integer to category
ride_sharing['user_type_cat'] = ride_sharing['user_type'].astype('category')

In [6]:
# Write an assert statement confirming the change
assert ride_sharing['user_type_cat'].dtype == 'category'


In [7]:
# Print new summary statistics 
print(ride_sharing['user_type_cat'].describe())

count     25760
unique        3
top           2
freq      12972
Name: user_type_cat, dtype: int64
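
Once the column is categorical, the numeric codes can also be replaced with readable labels. The sketch below is one possible way to do that on a small, made-up Series; the label strings are paraphrased from the description of user_type above and are not part of the dataset.

```python
import pandas as pd

# Made-up sample of user_type codes, stored as a category
user_type = pd.Series([2, 2, 3, 1, 2], name="user_type").astype("category")

# Hypothetical label mapping based on the column description
labels = {1: "free rider", 2: "pay per ride", 3: "monthly subscriber"}

# rename_categories accepts a dict mapping old categories to new ones
user_type_labeled = user_type.cat.rename_categories(labels)

# Frequency of each user type, now with readable labels
print(user_type_labeled.value_counts())
```

With labels in place, frequency tables and plots no longer require looking up what each code means.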


### Summing strings and concatenating numbers
In the previous exercise, you were able to identify that category is the correct data type for user_type and convert it in order to extract relevant statistical summaries that shed light on the distribution of user_type.

Another common data type problem is importing what should be numerical values as strings. Operations such as summing then concatenate the strings instead of producing a numerical result.
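
To see the difference, here is a minimal sketch on a made-up three-value Series (the values are illustrative only). The `object` dtype is set explicitly to mirror how the string columns appear in the `.info()` output above.

```python
import pandas as pd

# Numbers stored as text, as they might arrive from a CSV file
as_text = pd.Series(["12", "24", "8"], dtype="object")

# The same values converted to integers
as_numbers = as_text.astype("int")

text_total = as_text.sum()       # joins the strings: "12248"
number_total = as_numbers.sum()  # adds the numbers: 44

print(repr(text_total))
print(number_total)
```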

In this exercise, you'll be converting the string column duration to the type int. Before that, however, you will need to strip "minutes" from the column so that pandas can read it as numerical.

In [8]:
# Strip duration of minutes
ride_sharing['duration_trim'] = ride_sharing['duration'].str.strip('minutes')

# Convert duration to integer
ride_sharing['duration_time'] = ride_sharing['duration_trim'].astype('int')

# Write an assert statement making sure of conversion
assert ride_sharing['duration_time'].dtype == 'int'

# Print formed columns and calculate average ride duration 
print(ride_sharing[['duration','duration_trim','duration_time']])
print('Average ride sharing duration time is {:.2f}'.format(ride_sharing['duration_time'].mean()))

         duration duration_trim  duration_time
0      12 minutes           12              12
1      24 minutes           24              24
2       8 minutes            8               8
3       4 minutes            4               4
4      11 minutes           11              11
...           ...           ...            ...
25755  11 minutes           11              11
25756  10 minutes           10              10
25757  14 minutes           14              14
25758  14 minutes           14              14
25759  29 minutes           29              29

[25760 rows x 3 columns]
Average ride sharing duration time is 11.39
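
One detail worth knowing: `.str.strip('minutes')` treats its argument as a set of characters to remove from both ends of each string, not as a literal word, so the space before "minutes" is left behind. The conversion still works here because the integer conversion tolerates surrounding whitespace. The sketch below, on made-up values, contrasts this with removing the literal suffix `" minutes"`, which is an alternative and not the approach used in the cell above.

```python
import pandas as pd

# Made-up durations in the same format as the duration column
duration = pd.Series(["12 minutes", "24 minutes", "8 minutes"])

# strip() removes any of the characters m, i, n, u, t, e, s from both ends
stripped = duration.str.strip("minutes")

# Alternative: remove the literal text " minutes", including the space
removed = duration.str.replace(" minutes", "", regex=False)
as_int = removed.astype("int")

print(stripped.tolist())  # ['12 ', '24 ', '8 '] (note the trailing space)
print(as_int.tolist())    # [12, 24, 8]
```

The character-set behavior matters more when a value legitimately starts or ends with one of the stripped characters, since those characters would be removed too.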


In [9]:
# Create a random tire size for each row in the dataset.
# Note: random.randint is inclusive on both ends, so 26, 27, 28 and 29 can all appear.
tire_sizes = []
for s in range(0, 25760):
    n = random.randint(26, 29)
    tire_sizes.append(n)

# Create a tire_sizes column in the dataset
ride_sharing['tire_sizes'] = tire_sizes

In [10]:
ride_sharing.head()

Unnamed: 0.1,Unnamed: 0,duration,station_A_id,station_A_name,station_B_id,station_B_name,bike_id,user_type,user_birth_year,user_gender,user_type_cat,duration_trim,duration_time,tire_sizes
0,0,12 minutes,81,Berry St at 4th St,323,Broadway at Kearny,5480,2,1959,Male,2,12,12,26
1,1,24 minutes,3,Powell St BART Station (Market St at 4th St),118,Eureka Valley Recreation Center,5193,2,1965,Male,2,24,24,26
2,2,8 minutes,67,San Francisco Caltrain Station 2 (Townsend St...,23,The Embarcadero at Steuart St,3652,3,1993,Male,3,8,8,29
3,3,4 minutes,16,Steuart St at Market St,28,The Embarcadero at Bryant St,1883,1,1979,Male,1,4,4,29
4,4,11 minutes,22,Howard St at Beale St,350,8th St at Brannan St,4626,2,1994,Male,2,11,11,26


In [11]:
ride_sharing.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 25760 entries, 0 to 25759
Data columns (total 14 columns):
 #   Column           Non-Null Count  Dtype   
---  ------           --------------  -----   
 0   Unnamed: 0       25760 non-null  int64   
 1   duration         25760 non-null  object  
 2   station_A_id     25760 non-null  int64   
 3   station_A_name   25760 non-null  object  
 4   station_B_id     25760 non-null  int64   
 5   station_B_name   25760 non-null  object  
 6   bike_id          25760 non-null  int64   
 7   user_type        25760 non-null  int64   
 8   user_birth_year  25760 non-null  int64   
 9   user_gender      25760 non-null  object  
 10  user_type_cat    25760 non-null  category
 11  duration_trim    25760 non-null  object  
 12  duration_time    25760 non-null  int64   
 13  tire_sizes       25760 non-null  int64   
dtypes: category(1), int64(8), object(5)
memory usage: 2.6+ MB


In [12]:
# Change the data type of tire_sizes from integer to category
ride_sharing['tire_sizes'] = ride_sharing['tire_sizes'].astype('category')

In [13]:
#Checking if the data type change really worked
assert ride_sharing['tire_sizes'].dtype == 'category'

### Tire size constraints
In this lesson, you're going to build on top of the work you've been doing with the ride_sharing DataFrame. You'll be working with the tire_sizes column which contains data on each bike's tire size.

Bicycle tire sizes could be either 26″, 27″ or 29″ and are here correctly stored as a categorical value. In an effort to cut maintenance costs, the ride sharing provider decided to set the maximum tire size to be 27″.

In this exercise, you will make sure the tire_sizes column has the correct range by first converting it to an integer, then setting and testing the new upper limit of 27″ for tire sizes.

In [14]:
# Convert tire_sizes to integer
ride_sharing['tire_sizes'] = ride_sharing['tire_sizes'].astype('int')

# Set all values above 27 to 27
ride_sharing.loc[ride_sharing['tire_sizes'] > 27, 'tire_sizes'] = 27

# Test the new upper limit
assert ride_sharing['tire_sizes'].max() <= 27

# Reconvert tire_sizes back to categorical
ride_sharing['tire_sizes'] = ride_sharing['tire_sizes'].astype('category')

# Print tire size description
print(ride_sharing['tire_sizes'].describe())

count     25760
unique        2
top          27
freq      19430
Name: tire_sizes, dtype: int64
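
The same upper limit can also be applied with `Series.clip`, which caps values in one call. This is an alternative sketch on a small, made-up Series, not the method used in the cell above.

```python
import pandas as pd

# Made-up tire sizes stored as integers
tire_sizes = pd.Series([26, 27, 29, 29, 26], dtype="int64")

# Cap every value above 27 at 27; values at or below 27 are unchanged
capped = tire_sizes.clip(upper=27)

# Verify the new upper limit before converting back to a category
assert capped.max() <= 27
capped_cat = capped.astype("category")

print(capped.tolist())
print(list(capped_cat.cat.categories))
```

`clip` keeps the intent (an upper bound) visible in the code, while the `.loc` assignment is more flexible when out-of-range values should be replaced with something other than the limit itself.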


In [15]:
ride_sharing.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 25760 entries, 0 to 25759
Data columns (total 14 columns):
 #   Column           Non-Null Count  Dtype   
---  ------           --------------  -----   
 0   Unnamed: 0       25760 non-null  int64   
 1   duration         25760 non-null  object  
 2   station_A_id     25760 non-null  int64   
 3   station_A_name   25760 non-null  object  
 4   station_B_id     25760 non-null  int64   
 5   station_B_name   25760 non-null  object  
 6   bike_id          25760 non-null  int64   
 7   user_type        25760 non-null  int64   
 8   user_birth_year  25760 non-null  int64   
 9   user_gender      25760 non-null  object  
 10  user_type_cat    25760 non-null  category
 11  duration_trim    25760 non-null  object  
 12  duration_time    25760 non-null  int64   
 13  tire_sizes       25760 non-null  category
dtypes: category(2), int64(7), object(5)
memory usage: 2.4+ MB


In [16]:
# Generate random ride dates; a date column is needed for the next exercise.
# Note: this loop only prints the dates; it does not store them in the dataframe.
import random
from datetime import datetime, timedelta

min_year = 2017
max_year = datetime.now().year

start = datetime(min_year, 1, 1, 00, 00, 00)
years = max_year - min_year + 2
end = start + timedelta(days=365 * years)

for i in range(25760):
    random_date = start + (end - start) * random.random()
    print(random_date)

2022-10-27 17:46:50.247246
2022-02-04 04:14:35.268019
2022-04-16 09:14:57.974146
2018-01-12 01:58:17.483887
2021-12-25 16:42:00.887744
2021-07-22 23:43:45.018207
2020-01-13 19:31:14.962350
2017-08-29 20:01:43.805302
2020-07-13 13:55:45.239781
2017-07-27 09:06:02.138842
2020-11-03 00:14:50.238142
2019-08-06 03:29:35.827940
2018-07-04 01:40:32.029271
2017-08-09 22:50:08.342681
2022-08-06 12:53:19.004828
2018-12-22 16:45:41.820563
2017-05-12 12:39:48.152069
2021-04-29 06:20:39.543825
2020-01-27 08:17:40.569823
2017-11-24 09:13:58.666539
2017-02-18 01:57:00.417451
2021-08-22 12:34:40.778184
2017-11-27 23:42:18.259741
2017-10-26 03:58:56.585799
2017-09-07 07:56:48.771845
2019-11-16 10:43:11.352438
2022-12-26 23:58:56.481440
2019-07-31 08:19:57.263297
2018-04-13 04:28:35.835421
2022-11-09 23:37:37.729261
2019-04-24 19:05:58.796990
2018-07-01 23:43:00.547355
2017-07-10 05:52:09.834406
2021-03-23 01:52:10.803024
2020-08-22 06:22:31.786085
2017-08-30 23:52:58.170707
2022-03-14 15:21:28.892187
2

2018-01-17 21:33:30.501034
2021-04-15 04:51:20.705994
2022-12-06 09:45:09.671344
2018-05-26 23:50:57.675475
2021-08-28 10:59:07.830514
2020-06-24 01:11:44.043801
2017-04-09 11:27:57.033568
2020-08-29 20:19:55.426860
2018-07-09 14:08:46.624056
2020-07-25 16:39:36.404495
2022-06-25 23:37:27.332224
2021-07-30 16:49:25.209076
2019-10-26 12:10:09.310614
2019-04-02 08:16:07.953883
2019-09-10 18:57:54.150761
2020-06-18 03:00:42.373832
2022-09-01 18:03:27.567466
2019-09-23 11:31:59.163197
2022-07-14 02:55:13.381224
2018-04-26 15:42:47.829817
2018-03-30 08:06:30.097715
2021-11-15 02:45:11.323789
2022-09-14 10:35:51.625525
2020-09-10 20:42:32.685526
2021-05-10 11:42:27.873263
2018-07-23 20:51:23.590159
2018-02-14 20:24:03.316221
2020-06-26 01:26:31.862281
2017-07-14 05:56:33.806959
2021-11-09 01:11:27.026640
2020-02-22 22:31:49.437225
2022-09-26 23:54:15.992257
2018-08-16 12:58:42.214318
2019-03-13 05:52:34.168535
2022-08-12 18:29:42.133035
2021-12-20 16:08:06.847903
2019-09-01 23:55:53.387228
2

2017-12-16 20:34:48.797684
2019-08-18 09:24:42.652405
2021-10-11 04:53:15.476695
2019-09-19 14:55:02.169358
2017-12-15 04:13:40.213439
2019-07-20 20:02:11.143645
2020-10-29 10:09:27.639964
2022-05-22 19:34:01.767990
2021-02-01 13:45:34.181927
2022-06-21 22:46:46.687664
2022-05-19 10:14:25.546210
2021-05-21 17:01:19.362710
2019-09-20 05:39:13.664856
2017-11-22 06:58:11.899641
2020-08-31 17:28:26.076107
2017-09-18 01:41:06.404805
2019-03-01 06:41:24.842578
2022-05-20 20:37:27.277149
2019-08-24 20:57:04.576152
2021-08-04 12:26:49.466750
2019-02-18 21:33:06.823412
2019-04-28 19:05:11.428846
2018-03-21 11:21:02.366288
2022-12-13 06:11:47.205779
2019-07-26 04:08:37.350436
2022-04-21 21:50:35.984754
2021-02-18 08:25:34.151701
2020-10-31 21:44:40.504472
2020-08-06 07:14:51.445371
2019-02-28 22:34:18.540954
2018-03-26 19:55:09.566426
2022-05-10 13:58:11.133994
2019-08-26 13:24:41.990634
2017-08-29 00:09:16.755701
2020-10-31 23:54:03.509030
2021-08-19 08:48:10.797784
2020-12-30 16:59:30.824165
2

2021-07-27 05:36:17.032740
2017-09-17 09:25:51.667251
2017-05-07 01:45:02.550353
2019-08-31 05:09:01.735430
2020-03-10 21:50:33.235751
2020-05-11 07:51:18.741581
2018-05-17 03:04:42.133648
2017-11-22 17:06:58.024079
2021-11-12 02:51:53.927080
2022-03-12 06:49:00.570255
2019-12-19 02:40:28.456921
2019-04-20 00:08:56.105480
2020-03-22 21:06:26.480232
2020-08-03 17:34:27.259289
2019-03-11 14:45:13.722515
2022-10-16 15:36:16.886945
2019-08-24 11:41:15.913530
2017-12-26 09:48:10.898225
2021-03-02 22:33:04.436171
2017-12-20 18:09:01.817300
2021-03-22 13:35:09.794384
2020-04-23 11:24:25.578240
2019-12-09 11:12:37.317123
2021-04-17 23:22:54.511244
2022-06-16 18:32:02.945843
2017-08-26 10:30:03.304664
2017-11-20 07:53:44.769410
2020-02-03 14:36:07.648127
2021-06-19 01:29:32.194990
2021-10-10 10:32:50.692151
2021-11-09 07:20:47.731042
2020-11-16 07:13:49.367474
2018-05-16 17:30:47.628608
2018-04-15 20:11:42.783770
2018-12-14 08:59:23.478324
2021-10-29 00:25:06.315905
2022-03-23 11:20:03.187258
2

2021-04-15 00:12:55.391979
2021-02-19 03:01:03.952590
2021-01-04 22:38:51.680551
2021-11-10 12:33:37.753493
2021-12-04 07:25:33.999469
2021-01-12 08:24:03.160594
2022-06-30 13:40:31.552365
2018-10-25 04:09:23.898718
2020-09-06 06:05:03.052407
2018-11-13 22:39:59.034165
2021-05-22 17:19:24.198495
2017-04-22 13:41:59.885281
2019-07-01 06:00:48.262607
2020-12-14 02:22:15.004958
2017-11-03 02:04:53.878729
2020-09-03 11:31:56.011780
2017-04-08 09:45:45.128143
2018-12-29 01:52:44.938673
2020-11-24 21:18:21.067139
2020-07-01 13:56:20.413153
2018-08-20 00:25:28.797523
2019-07-13 21:31:54.123726
2018-02-28 13:28:31.028162
2018-11-06 17:03:11.834816
2018-11-01 13:22:43.393545
2022-01-03 15:18:24.522749
2020-08-31 22:14:54.645085
2017-07-16 04:56:37.498711
2017-12-16 21:05:08.674580
2017-07-15 17:01:45.380895
2020-05-24 13:31:06.597831
2020-04-10 21:54:08.289981
2020-10-14 06:22:36.979736
2018-03-14 17:36:13.878115
2018-08-09 23:18:47.224813
2017-08-25 00:24:40.895861
2018-10-26 09:21:22.994094
2

2017-04-22 17:28:00.489625
2022-11-14 22:38:20.067618
2018-12-17 21:04:07.164022
2021-07-01 17:30:54.780397
2020-03-09 23:26:51.792792
2018-01-23 05:28:42.312401
2021-10-08 04:51:16.725820
2018-05-20 15:15:09.681748
2022-06-17 06:47:55.005265
2018-07-10 21:21:48.588792
2018-11-13 05:32:36.386280
2019-11-10 06:40:52.776595
2017-11-24 00:59:09.210139
2019-01-23 07:29:21.005264
2022-09-30 10:48:04.301420
2022-01-28 03:16:52.026304
2022-09-22 21:48:00.822850
2022-05-02 01:56:07.426284
2017-01-22 18:26:27.107043
2021-04-11 23:52:19.623330
2020-04-06 16:23:13.016940
2017-12-07 23:41:21.382697
2018-11-07 17:26:26.748680
2017-03-31 13:33:42.263282
2021-01-19 10:46:13.917235
2018-01-30 15:14:50.339837
2021-06-12 07:14:17.329356
2020-07-22 21:51:39.094184
2021-11-05 06:44:11.005334
2017-01-07 21:47:01.088696
2021-03-16 10:24:46.163281
2022-12-01 17:12:02.435479
2019-08-11 15:54:59.615249
2017-03-23 09:27:59.995175
2022-04-14 16:20:35.519145
2021-09-07 07:18:45.086935
2020-09-22 11:38:49.734745
2

2018-10-16 02:30:00.181637
2018-08-31 13:17:02.609319
2017-10-27 19:27:09.845001
2017-09-01 07:55:59.722453
2020-03-22 08:42:53.758666
2020-08-17 08:31:28.114683
2022-02-16 15:06:16.399555
2020-09-23 19:34:25.486632
2017-05-16 12:59:51.931450
2020-05-08 06:51:29.520015
2018-08-17 20:24:06.886406
2022-03-14 23:26:08.225238
2019-03-13 14:55:54.821210
2020-04-01 03:05:30.485619
2022-11-09 21:05:10.245775
2022-03-15 21:29:46.684639
2021-10-27 18:38:13.081607
2019-06-27 08:18:12.367257
2019-12-04 17:24:02.033390
2018-07-30 23:51:32.394599
2018-07-11 21:01:45.419126
2022-05-04 18:53:33.308473
2017-09-13 20:33:17.328758
2018-09-06 04:57:08.678904
2020-07-02 20:28:01.488121
2021-10-29 13:10:41.788756
2020-09-29 18:15:05.154618
2019-02-06 21:30:58.171294
2017-12-14 15:39:35.539863
2019-08-30 22:27:00.610125
2018-09-29 21:51:53.784709
2021-09-15 23:32:13.613809
2021-06-28 05:00:35.298141
2022-05-08 21:24:51.251415
2020-03-07 09:36:56.027795
2019-10-18 22:09:33.392273
2022-11-22 20:41:21.145911
2

2018-04-15 10:02:01.365274
2019-10-26 10:03:01.800123
2019-12-21 13:25:16.520774
2022-12-11 08:38:42.180632
2020-07-21 17:32:07.533631
2019-01-06 08:21:20.781464
2019-07-11 20:31:50.527102
2019-09-22 05:43:23.352365
2020-11-29 06:31:56.129202
2020-04-03 21:31:28.943913
2018-04-22 01:56:08.894096
2021-12-06 03:25:06.439470
2022-03-03 22:25:02.401341
2018-08-01 06:35:06.277597
2018-03-07 20:42:12.959219
2022-01-16 19:01:35.669341
2022-03-18 21:35:43.777652
2017-04-18 09:34:59.872225
2018-05-27 09:40:44.727814
2017-07-20 00:29:49.881816
2022-07-04 10:36:40.216928
2019-01-07 06:02:54.986423
2019-09-24 04:19:34.668171
2022-10-29 22:06:59.313703
2022-01-09 14:19:36.866031
2022-12-02 17:07:31.075956
2019-07-10 06:04:41.413455
2018-06-05 08:43:01.799809
2017-01-14 13:30:22.959113
2022-04-04 15:26:00.005324
2017-02-08 04:39:08.748551
2020-10-02 09:14:50.965706
2021-08-20 16:49:09.662204
2020-12-28 12:05:07.106874
2021-08-19 09:05:16.161343
2019-10-14 17:12:58.500737
2022-06-25 08:11:50.455500
2

2020-07-15 00:51:08.152632
2021-04-10 08:48:28.353891
2020-09-18 07:37:59.410745
2017-08-24 08:00:36.406663
2018-06-23 21:52:47.336924
2022-06-12 16:59:01.848098
2020-10-30 10:49:18.729860
2017-10-12 12:11:09.317234
2021-09-30 18:55:23.098721
2019-09-17 09:46:43.874133
2022-01-25 11:09:14.339896
2019-10-29 18:39:18.466154
2021-01-26 02:13:40.921985
2021-11-12 21:46:34.695176
2022-07-13 16:11:54.346414
2018-07-17 00:03:46.021612
2019-08-21 01:20:32.758982
2020-06-30 16:38:59.147245
2018-07-02 11:17:11.495304
2022-10-03 01:45:43.986753
2019-08-07 11:24:31.912045
2017-03-03 02:21:21.135144
2021-07-04 12:57:46.138776
2018-07-25 06:03:54.651075
2019-05-01 23:07:12.041833
2021-10-08 23:45:27.349315
2021-11-19 23:33:40.376833
2021-08-24 00:26:11.644328
2018-02-13 01:00:28.115512
2018-12-25 09:47:05.634728
2017-07-26 04:56:09.842965
2017-03-16 05:37:56.925816
2018-09-09 13:27:23.724749
2022-07-26 01:15:34.873836
2020-03-29 06:32:53.953270
2019-09-26 00:39:40.267630
2017-07-09 22:38:03.028207
2

2022-01-30 11:38:30.587829
2022-07-21 13:40:51.167365
2020-02-18 01:01:53.090453
2020-02-05 19:56:43.163289
2020-06-25 06:02:48.512576
2020-01-23 11:25:11.152381
2020-08-24 19:36:21.506245
2018-06-18 13:21:33.811131
2021-12-25 22:04:19.671683
2018-12-07 11:18:49.580549
2019-07-12 09:42:33.724144
2022-09-03 16:11:23.924559
2019-08-29 23:27:04.321341
2019-05-09 12:13:10.391144
2018-09-08 10:41:34.445361
2020-02-13 04:49:51.645959
2018-10-02 14:45:25.619090
2021-09-10 17:50:30.227304
2017-08-20 13:52:14.778487
2020-06-24 16:50:25.898589
2020-09-27 10:24:50.604763
2021-12-06 02:49:53.904066
2017-03-19 15:23:13.454627
2020-08-18 01:14:08.706810
2017-01-26 00:18:17.058576
2021-06-02 20:34:54.815003
2017-04-29 02:03:53.087582
2017-03-17 06:39:22.685727
2020-03-27 07:35:33.221254
2018-11-09 07:04:36.440352
2017-06-10 18:30:37.438262
2020-06-16 18:57:25.334639
2017-12-05 07:22:15.020930
2017-07-20 19:16:52.168158
2017-09-26 17:11:38.112080
2021-02-13 06:47:18.360526
2019-07-01 04:13:29.554199
2

2019-06-05 02:08:48.408634
2017-01-18 07:06:57.461510
2020-12-18 07:59:38.820022
2022-12-28 07:48:44.601118
2017-03-11 19:42:02.436550
2017-12-28 21:20:29.904966
2022-07-22 01:18:01.375694
2017-07-05 04:21:11.845569
2019-11-02 17:33:43.708127
2021-11-23 19:30:18.702133
2020-11-10 09:06:27.820637
2017-03-29 21:14:10.855845
2017-10-14 06:06:45.695726
2019-09-22 22:43:26.405101
2017-07-30 11:37:04.314193
2022-12-09 07:22:27.415714
2022-07-21 04:00:33.020705
2022-01-15 17:07:25.760303
2021-11-29 01:46:40.919563
2018-11-29 10:28:22.843982
2017-05-04 17:57:33.174289
2017-10-30 09:06:46.506369
2022-08-30 06:46:17.489595
2022-02-02 03:52:13.094821
2020-05-03 17:18:49.625091
2018-10-02 07:29:41.847810
2017-12-20 16:35:15.509835
2018-08-19 09:20:10.287132
2017-07-21 01:10:11.029370
2020-05-19 23:14:23.976373
2021-07-20 08:33:33.073123
2021-05-15 09:50:38.686625
2019-05-18 01:59:09.828517
2021-03-10 03:21:47.247843
2019-02-14 11:32:45.644501
2018-03-22 07:29:59.548368
2021-08-26 08:01:56.532957
2

2022-03-19 11:16:47.939742
2017-03-23 23:27:26.949070
2019-10-20 19:44:13.142220
2019-08-21 05:54:00.769062
2021-08-31 05:24:51.925943
2020-04-26 19:55:25.706966
2021-01-21 21:40:40.510180
2021-07-29 01:26:34.555061
2021-06-27 15:19:11.306800
2020-04-13 21:17:16.143101
2022-11-08 22:36:55.747384
2017-07-20 04:18:30.078434
2017-03-21 20:32:18.134550
2022-02-05 04:16:46.390565
2022-01-11 07:18:56.120576
2018-12-02 12:12:51.237006
2020-08-26 04:24:02.177045
2021-12-08 13:57:38.671054
2021-01-03 04:54:28.194759
2017-06-26 13:55:01.219433
2017-08-19 05:04:32.095849
2017-08-08 04:31:55.719194
2017-11-23 09:34:58.010604
2018-12-20 17:38:55.639214
2021-03-14 10:58:51.832315
2021-01-27 03:30:38.455537
2020-10-02 09:08:43.913042
2021-06-10 23:48:09.625212
2020-12-15 00:04:14.985805
2021-09-04 07:46:49.394851
2017-08-02 09:49:17.491014
2020-02-02 08:48:55.499240
2019-05-13 01:03:04.880932
2020-01-18 23:23:46.610238
2022-11-24 15:10:01.382332
2018-11-25 09:27:49.251572
2020-08-16 23:48:59.072733
2

2020-07-03 06:06:44.064407
2019-08-25 23:09:13.041493
2019-04-20 17:36:02.051186
2018-07-04 23:19:45.520173
2018-04-09 04:15:45.310678
2018-01-03 18:07:49.168043
2017-01-24 21:31:03.791665
2022-03-31 18:49:51.786247
2019-04-21 07:11:32.206704
2019-10-02 02:54:46.343867
2021-07-27 07:10:03.745212
2019-11-21 11:59:02.999778
2020-01-04 10:39:53.645798
2017-06-24 09:46:00.828290
2022-05-28 01:46:50.443356
2017-12-21 10:27:21.739542
2020-10-13 17:22:58.663427
2020-10-08 13:34:33.425233
2017-01-04 13:16:40.798745
2018-09-29 19:07:56.941545
2019-07-11 22:17:38.754620
2021-01-23 06:57:37.394364
2020-12-11 10:48:35.524668
2019-03-11 23:24:36.255642
2020-11-06 06:56:58.979065
2017-04-03 16:44:33.255906
2018-06-21 01:43:36.970004
2018-02-10 07:31:24.000888
2022-01-21 13:34:18.261612
2022-04-20 06:55:13.111064
2022-05-24 12:21:51.661304
2019-01-31 22:48:24.378080
2020-01-10 11:06:36.943757
2018-09-28 08:51:00.462427
2021-09-28 05:09:44.065051
2020-04-29 21:54:04.135075
2019-07-01 12:44:58.953299
2

2021-02-23 07:29:27.733851
2021-10-02 18:41:19.929006
2021-06-30 06:44:48.271407
2021-11-09 02:59:17.300297
2020-01-25 08:31:21.706386
2020-10-25 14:52:33.407034
2022-07-22 02:59:22.287609
2022-12-10 12:50:48.921657
2021-05-28 00:00:04.833676
2022-06-08 04:47:27.267305
2021-09-08 17:45:37.885131
2020-11-19 02:23:12.849635
2020-02-14 04:33:38.965538
2018-09-07 22:26:32.610748
2022-10-06 17:19:28.638010
2020-10-02 11:56:04.566406
2021-10-08 12:47:33.638802
2020-04-22 14:14:27.568734
2022-12-24 05:54:30.818340
2021-06-12 04:35:16.720074
2020-01-05 09:40:06.659742
2022-06-28 18:28:44.210603
2017-11-21 00:02:47.524858
2019-08-04 03:06:08.386549
2019-08-15 20:47:29.780105
2019-10-25 06:59:06.805784
2018-05-15 06:04:39.155107
2022-05-07 14:04:12.901448
2022-03-16 22:21:09.351889
2022-07-07 13:31:07.395759
2022-09-19 09:26:52.920210
2021-08-31 04:35:18.512042
2018-04-30 16:51:11.071320
2022-03-29 14:00:52.779540
2021-01-31 22:53:00.163066
2017-02-01 00:53:18.376648
2022-11-13 20:23:21.723920
2

2018-12-12 16:56:56.752515
2020-09-19 10:04:27.992436
2019-05-09 07:47:22.020632
2019-11-28 02:43:13.796000
2019-09-12 12:12:00.119698
2019-03-24 10:45:53.551666
2017-07-14 12:22:28.571179
2021-11-16 03:34:45.228047
2019-03-17 15:23:03.551115
2017-11-23 08:46:47.640332
2022-11-29 02:38:05.480557
2021-08-15 01:47:52.670871
2021-02-01 03:48:28.803814
2018-09-17 03:42:54.829307
2020-09-27 17:55:07.500199
2019-07-21 12:28:50.487549
2020-02-18 16:39:23.473166
2018-12-25 17:07:11.323974
2022-09-01 09:43:43.290279
2022-11-21 09:13:48.262065
2021-05-21 18:54:46.762855
2022-01-23 20:51:57.040467
2021-07-28 18:22:23.681438
2021-06-16 04:24:30.464229
2018-05-17 09:22:35.822274
2022-02-23 15:46:09.533135
2017-03-18 17:46:19.196763
2019-10-27 08:27:22.694592
2020-04-05 12:00:52.535155
2021-02-21 01:50:21.524491
2020-07-02 19:23:53.776007
2021-04-15 08:26:59.320769
2022-07-20 20:02:45.938219
2022-06-25 09:45:26.817108
2017-10-10 10:47:44.288631
2018-12-23 21:06:30.965240
2018-02-01 08:57:58.621575
2

2022-01-07 03:42:01.500835
2019-11-20 23:35:38.186503
2020-01-27 15:45:55.497123
2019-01-19 07:59:46.309696
2017-09-27 06:46:02.671561
2022-07-08 06:45:07.863269
2021-05-01 19:10:25.577563
2022-08-18 23:42:40.943819
2017-10-10 13:20:24.151752
2017-12-06 15:12:14.026834
2022-03-27 13:53:11.906066
2021-12-05 01:57:30.945009
2021-02-05 23:41:25.570082
2018-11-20 06:09:52.691971
2018-05-28 11:59:18.482420
2019-04-12 14:55:40.943498
2019-05-21 08:32:02.954577
2017-06-29 13:27:33.824468
2017-04-17 01:17:49.147873
2020-09-23 21:06:53.387905
2020-10-29 22:40:13.080495
2018-08-09 11:55:32.367768
2018-01-19 21:04:39.003064
2018-06-22 19:38:43.552437
2018-11-01 00:42:19.698493
2017-09-15 22:23:08.051970
2022-06-04 21:55:31.909981
2018-05-23 17:49:35.826614
2017-07-27 13:12:23.106851
2022-12-04 16:21:12.992117
2017-06-12 19:29:17.595934
2020-02-05 07:24:40.522865
2018-01-21 05:11:57.534587
2017-09-27 00:47:06.405765
2017-05-30 18:45:17.791421
2020-09-28 10:27:34.472103
2017-06-26 09:13:19.324952
2

2017-07-23 17:02:25.953996
2017-01-18 12:01:09.670691
2019-10-30 11:48:23.246553
2021-11-07 00:11:38.209151
2022-11-20 05:56:56.288106
2020-01-10 21:15:19.897556
2021-07-22 16:59:37.147297
2017-01-28 08:59:26.699295
2021-05-29 10:07:36.312561
2017-03-11 08:41:44.098487
2022-09-12 22:56:00.903044
2018-11-08 10:06:50.291611
2021-05-22 10:15:14.702639
2019-09-14 11:21:29.680021
2017-02-05 19:01:54.462355
2022-02-09 04:55:27.180684
2021-03-10 05:44:51.337155
2020-11-25 06:41:42.574468
2017-06-23 15:58:27.622919
2017-09-02 19:34:40.532233
2018-01-19 06:43:40.755365
2022-01-28 23:14:14.004443
2018-06-07 19:43:20.130382
2020-07-23 05:05:12.517523
2019-05-13 13:57:11.750575
2020-10-02 11:20:24.835193
2017-10-01 07:04:42.877609
2019-10-25 15:10:04.126278
2022-01-14 21:27:35.680338
2022-09-14 00:49:05.347475
2019-10-05 20:25:19.380933
2017-02-06 21:01:39.151354
2020-08-17 21:25:02.490553
2019-07-03 13:32:36.643879
2018-05-11 23:49:57.144616
2021-11-19 07:08:34.837882
2022-02-05 01:39:58.397328
2

2021-12-20 14:49:31.147968
2021-10-11 05:37:55.263984
2019-11-28 19:31:51.550155
2022-09-05 00:28:50.583159
2017-07-28 12:17:46.145806
2021-01-03 08:54:48.785019
2020-04-23 05:51:26.195324
2020-10-27 12:03:15.520947
2018-11-22 05:13:10.690013
2022-04-08 03:00:21.815128
2018-08-11 05:45:37.661642
2022-09-26 22:56:59.029052
2018-05-18 05:01:57.946405
2020-08-21 10:54:33.627004
2022-08-30 14:14:37.659080
2019-04-15 23:51:20.657688
2019-11-27 13:03:43.196511
2017-11-27 05:01:19.238025
2022-04-15 21:09:08.529496
2018-08-22 08:44:03.836111
2019-09-05 20:33:44.679125
2017-08-05 02:09:00.461425
2017-04-11 08:16:50.577507
2020-01-15 00:41:47.829415
2021-10-06 16:08:07.044759
2021-11-23 00:15:24.630626
2022-05-02 04:29:55.107147
2019-09-16 06:09:08.512398
2020-12-11 04:15:27.360327
2022-10-23 07:42:58.726297
2019-04-04 15:09:14.271447
2019-08-18 14:25:36.927306
2021-01-29 00:26:50.917415
2018-08-15 13:06:55.951393
2021-09-30 12:41:19.597567
2021-10-08 21:58:56.038767
2018-01-01 08:43:12.860853
2

2020-12-07 21:15:31.143653
2022-10-21 04:51:16.094881
2020-08-24 08:32:14.490599
2022-02-18 15:26:15.885249
2020-10-25 05:43:23.740158
2017-02-02 09:22:56.844491
2018-07-10 23:19:24.271333
2019-11-03 12:00:32.789830
2018-01-19 21:55:14.843722
2017-03-01 04:10:45.912664
2020-07-28 21:04:46.682284
2020-01-23 06:14:29.591799
2021-11-07 15:04:44.905737
2019-02-19 10:50:25.757022
2019-02-13 17:59:57.625086
2021-03-31 12:49:36.605966
2017-01-23 23:28:46.351764
2022-05-04 11:31:55.850113
2017-09-20 13:36:43.048214
2022-11-03 03:34:31.280089
2017-06-15 18:06:18.310411
2018-09-02 10:38:12.047091
2020-11-29 15:53:07.620010
2017-09-16 05:10:23.601659
2017-09-13 16:03:08.848222
2019-12-27 17:18:04.708165
2020-05-09 10:38:47.518135
2019-04-24 04:13:21.090929
2020-08-06 01:32:40.672697
2022-08-20 01:52:38.992284
2021-03-19 19:59:42.909628
2019-04-17 16:27:53.076661
2021-08-11 10:33:21.885816
2017-05-31 02:20:33.602516
2020-03-02 04:11:17.614492
2017-03-08 21:48:56.821602
2021-10-06 04:35:38.800187
2

2022-06-18 09:32:12.598915
2018-03-31 15:48:30.555248
2019-10-17 08:18:27.469135
2017-08-29 22:57:16.993252
2017-05-03 09:51:00.075148
2019-01-30 23:49:10.736478
2021-03-05 00:39:02.669148
2017-03-01 18:41:30.928435
2018-04-18 05:49:22.015543
2021-07-06 19:48:38.945673
2022-06-29 23:39:48.392516
2017-02-16 06:29:31.825051
2020-01-08 17:35:21.200614
2021-11-06 20:18:03.276901
2018-01-09 09:01:47.121419
2021-05-24 03:13:40.375862
2019-04-23 22:15:45.646270
2017-04-11 20:30:06.515784
2017-03-23 12:06:01.817973
2021-03-11 07:50:49.450065
2022-08-30 17:22:25.506563
2022-01-08 12:25:29.294740
2019-05-27 03:24:33.848159
2019-11-11 02:48:39.934432
2019-02-07 17:48:30.296284
2019-05-13 22:38:25.660457
2019-02-21 23:25:04.378574
2022-02-17 23:19:55.115901
2020-09-07 14:33:10.727076
2017-12-25 07:32:25.113286
2021-08-30 03:24:45.059016
2017-10-18 07:10:53.140504
2018-12-16 02:31:31.224950
2020-05-26 19:53:50.215451
2019-06-16 12:31:58.886649
2021-11-05 18:16:44.103007
2017-01-09 07:52:59.542119
2

2021-05-27 14:38:07.956535
2022-08-15 01:59:27.146274
2019-01-24 05:12:14.879225
2018-11-08 23:24:44.028614
2018-12-23 15:32:45.434090
2018-10-15 09:02:42.536975
2020-12-07 08:38:59.650060
2018-11-20 15:18:41.311930
2021-07-09 13:13:12.404074
2021-01-24 15:17:45.117466
2021-11-23 06:34:30.066686
2018-07-04 09:50:53.750628
2021-10-13 12:34:19.698120
2017-11-26 08:19:34.319805
2021-07-03 01:17:17.678441
2017-06-24 22:51:25.977536
2019-04-20 16:01:16.565989
2017-08-28 05:39:57.641519
2020-12-06 17:26:55.677448
2017-07-26 22:34:03.980894
2021-07-31 17:24:37.143201
2019-09-28 22:24:35.031620
2021-01-02 19:30:58.332532
2019-07-03 01:22:52.152789
2021-01-12 06:39:57.342844
2020-09-30 12:17:02.744820
2017-12-27 03:34:16.959762
2019-12-18 02:36:28.994641
2018-07-27 11:51:02.823965
2020-11-20 19:33:31.309963
2020-01-26 01:58:18.375157
2022-10-05 12:24:38.964154
2019-10-01 21:18:14.999714
2018-08-15 11:13:52.759357
2021-04-13 00:35:52.542196
2018-10-11 12:49:30.068238
2019-11-12 10:26:27.495345
2

2020-09-15 15:01:39.586831
2020-05-25 11:33:03.155535
2018-10-02 16:43:45.506772
2020-12-14 14:48:35.578430
2019-10-10 14:59:53.528399
2019-07-25 07:58:50.880318
2017-12-08 22:21:19.462844
2022-08-05 08:05:23.451826
2017-09-15 17:02:29.299888
2021-03-27 16:44:14.848694
2018-03-07 16:31:28.985962
2019-10-03 11:20:16.571995
2018-12-15 00:21:22.004949
2018-09-28 06:30:12.423725
2021-11-22 20:59:36.415862
2020-09-24 17:50:15.199565
2017-12-20 05:36:29.669475
2017-06-20 04:01:14.489399
2017-01-10 14:21:48.566865
2018-12-10 02:21:37.670920
2019-04-12 18:55:32.727719
2022-11-05 07:22:05.844686
2020-05-29 22:25:22.410316
2017-12-04 10:10:06.450494
2018-06-18 05:20:48.710399
2022-11-02 04:03:57.749237
2022-07-13 05:38:56.765819
2019-06-21 20:48:08.332481
2018-08-11 07:27:31.481446
2020-09-24 16:33:42.366929
2018-06-09 17:06:59.471992
2021-07-11 17:49:13.171044
2020-11-23 01:13:27.683512
2018-02-16 16:08:26.872571
2019-02-14 18:05:58.751077
2020-12-22 07:58:48.198412
2017-09-22 13:05:45.882352

2021-06-10 11:24:30.753119
2018-07-03 10:04:29.348822
2017-12-30 19:15:49.028811
2017-05-02 02:32:32.629020
2017-12-24 00:44:38.104074
2019-04-09 14:55:17.220805
2019-07-02 21:21:58.469690
2021-05-22 02:03:02.702874
2022-11-03 16:07:43.170849
2022-10-19 18:36:17.817931
2020-10-05 20:04:18.077349
2017-07-26 14:40:39.100898
2021-09-18 19:21:47.747711
2019-09-21 03:01:19.234705
2021-12-18 22:32:05.370050
2021-11-28 14:41:08.842808
2018-12-12 14:45:23.826555
2019-06-15 10:06:22.430253
2018-12-12 23:25:21.295180
2018-07-05 19:11:59.801149
2018-09-27 10:53:30.544887
2017-07-12 14:58:05.081420
2022-02-17 21:50:19.520414
2022-10-11 11:30:49.794165
2020-05-20 14:23:55.749854
2019-01-27 18:30:39.524067
2020-12-18 10:00:19.751485
2018-02-16 05:28:12.641816
2020-01-28 08:27:39.317075
2022-12-12 09:56:52.448061
2018-09-06 00:21:30.431573
2021-05-02 12:05:41.778718
2018-02-23 11:19:20.095742
2021-07-15 16:22:26.588980
2020-09-30 11:45:42.891602
2018-12-27 14:00:00.675586
2019-09-26 06:40:51.314677

2018-08-11 09:28:13.114383
2020-03-09 17:59:22.723181
2020-01-08 02:06:32.704990
2022-10-03 20:54:41.905426
2022-11-02 05:39:08.582410
2017-01-29 07:24:06.474075
2019-04-21 13:23:44.089567
2022-07-09 06:39:56.253786
2022-11-13 09:25:28.694517
2021-12-01 09:18:27.332967
2017-05-18 10:36:08.456324
2019-09-01 10:21:09.995757
2020-01-04 19:49:43.226737
2019-11-29 00:01:54.144749
2019-12-24 21:48:23.039241
2022-12-20 00:29:36.045951
2018-02-08 06:59:00.677778
2021-01-10 03:19:22.901412
2022-12-14 17:49:21.883640
2022-05-27 12:12:20.688493
2022-09-02 12:27:50.432068
2021-06-20 21:23:11.245611
2021-04-14 15:14:29.796976
2018-11-01 22:05:55.242875
2019-09-12 01:13:30.769611
2020-04-07 21:29:24.399137
2020-11-26 00:11:16.727685
2017-09-10 22:04:16.383124
2021-10-02 02:59:22.871272
2017-09-07 07:29:50.752290
2018-07-12 07:06:05.533744
2021-09-08 21:15:22.136950
2021-06-07 20:38:12.909739
2017-05-30 17:16:31.136486
2022-10-12 07:34:29.952376
2017-08-18 17:04:51.039183
2019-03-09 20:40:20.243285

2017-01-05 10:11:25.003587
2018-09-22 00:09:00.651675
2022-11-07 13:34:49.089133
2017-12-22 06:31:59.697195
2022-04-08 09:51:38.298646
2022-04-10 16:15:11.790163
2020-06-28 10:57:21.182112
2022-02-06 12:52:17.037081
2018-01-26 03:43:56.413877
2019-05-04 13:17:51.835658
2022-05-13 08:12:04.456160
2017-03-15 15:21:09.963241
2019-06-27 10:08:01.490618
2022-07-16 01:32:35.875164
2017-06-09 04:01:02.812054
2020-05-13 16:32:35.158722
2020-04-21 10:49:50.541397
2017-12-23 09:13:30.817239
2022-10-17 10:41:33.587204
2017-05-09 15:23:21.032931
2022-04-02 22:11:09.534081
2017-05-11 10:55:06.834247
2017-10-12 11:21:26.258886
2022-04-30 10:43:11.358684
2018-02-10 13:00:05.968280
2018-10-07 17:05:18.282076
2020-12-26 04:47:09.624836
2019-11-25 02:59:18.983068
2022-12-01 00:12:23.000809
2022-09-19 23:48:02.172407
2021-01-03 03:03:19.833176
2022-09-19 10:21:42.155556
2019-05-20 19:11:04.680074
2018-07-08 01:36:48.104000
2021-01-29 11:30:10.272222
2022-08-27 00:24:52.763007
2019-08-12 02:33:09.130275

2020-01-11 02:31:44.219312
2020-06-19 05:56:52.741983
2021-05-13 07:09:32.551227
2020-01-28 23:57:59.589192
2022-08-28 05:15:19.839127
2021-10-30 12:36:53.909294
2022-02-19 22:10:12.449261
2022-09-23 09:25:53.985703
2020-03-11 07:09:49.189622
2021-10-20 13:50:19.442086
2019-03-06 14:15:46.180910
2021-10-29 08:09:21.542318
2018-05-22 17:03:35.394347
2021-02-06 10:30:23.587622
2018-08-07 02:55:06.059007
2019-01-30 01:06:43.934911
2019-08-24 00:41:44.641003
2017-01-21 16:27:39.758612
2017-10-15 07:51:50.854380
2019-01-28 05:35:52.782669
2019-05-27 06:32:24.152077
2022-11-18 08:56:55.888127
2020-04-25 18:56:13.466891
2021-05-03 06:26:21.720720
2021-08-21 15:18:30.483166
2020-11-05 03:56:20.495150
2018-07-15 04:07:33.630919
2022-01-27 16:35:16.122785
2020-12-06 04:47:14.606858
2022-08-10 22:54:29.847219
2019-11-27 18:12:20.364638
2020-07-03 19:25:54.190629
2021-10-25 13:58:15.422039
2018-04-02 01:11:31.742552
2021-11-02 07:57:43.644335
2018-11-17 14:01:52.738964
2022-10-20 16:02:43.536436

2022-06-09 04:01:58.757828
2019-06-27 00:53:25.507233
2021-10-23 21:42:07.366194
2022-04-07 00:18:58.823286
2020-02-24 02:13:14.015445
2017-06-12 08:43:24.302157
2019-12-31 22:16:23.235177
2021-09-16 04:41:39.219531
2021-01-11 18:00:20.098099
2022-04-09 14:44:44.613399
2021-10-14 04:34:20.778929
2018-11-25 11:22:13.680015
2019-12-20 06:35:12.462087
2017-01-25 09:58:58.287117
2019-09-25 12:19:33.134182
2017-10-24 12:44:46.037031
2017-11-06 14:41:11.533301
2018-08-20 09:05:06.021786
2017-04-22 23:08:03.243741
2018-10-21 17:14:22.295254
2017-08-13 04:13:27.684099
2018-05-01 00:20:12.764459
2019-06-06 02:00:34.411246
2018-04-12 23:52:20.571303
2021-08-06 23:07:48.661924
2022-12-28 15:18:16.436667
2022-01-09 23:14:30.386347
2018-12-12 15:06:49.626981
2020-11-06 21:40:26.707022
2019-08-22 07:13:18.616950
2019-04-17 22:31:07.089697
2020-02-18 23:24:41.188803
2017-08-04 17:47:05.109120
2017-06-07 18:21:06.108158
2020-11-07 12:41:30.088309
2018-06-02 17:11:40.300680
2021-04-25 00:40:02.286124

2022-12-18 10:11:08.068870
2018-02-19 11:09:53.506866
2019-11-14 07:51:29.369287
2017-06-02 08:23:10.705711
2020-11-10 13:33:05.697610
2020-09-05 00:32:24.701964
2019-11-12 18:46:40.221144
2020-02-09 06:23:19.473972
2018-07-09 09:04:39.990532
2021-08-12 11:55:32.754845
2018-05-25 14:00:29.353862
2020-11-27 14:24:42.566030
2021-04-08 09:40:42.319055
2022-05-06 03:36:00.815407
2020-11-22 01:05:41.615932
2021-07-23 10:10:27.924908
2020-04-28 08:56:27.333304
2018-10-20 17:47:55.312093
2019-09-12 22:23:57.233241
2020-05-16 17:25:28.159802
2018-11-16 14:25:06.485115
2022-03-19 20:47:38.406173
2020-02-02 12:43:16.865452
2020-11-08 08:45:04.573733
2020-05-24 15:52:03.136572
2018-10-05 11:29:47.976127
2018-06-04 05:02:41.925927
2020-05-29 13:46:27.997154
2018-12-05 03:28:04.251724
2018-07-26 23:17:43.395033
2020-03-17 23:49:06.010835
2018-04-11 22:43:28.909541
2020-12-08 22:15:14.898240
2021-03-10 14:33:12.416866
2018-02-14 02:14:12.590346
2022-08-01 00:05:51.949914
2020-07-13 01:40:05.393002

2020-11-23 11:42:56.707303
2017-03-13 03:07:08.298116
2021-12-01 02:20:34.706688
2020-09-29 20:28:35.754850
2019-01-19 03:51:35.915889
2018-05-04 00:14:38.108861
2022-07-16 16:15:00.726814
2020-12-26 21:12:37.181635
2022-08-08 12:47:32.932040
2021-04-21 20:23:27.277426
2022-06-19 09:45:14.480211
2020-05-28 06:38:16.413361
2022-08-14 15:48:17.932380
2020-11-13 22:26:38.409657
2022-10-24 23:18:32.478700
2018-10-04 20:19:00.514378
2021-11-30 17:45:37.206983
2021-05-11 03:29:14.705492
2020-06-13 09:06:34.055014
2021-03-10 18:41:20.363357
2020-04-25 15:10:23.491230
2022-01-13 15:12:06.498720
2017-03-16 20:04:27.767166
2021-11-10 21:34:09.215150
2021-05-03 07:58:12.466282
2022-12-18 12:20:18.481528
2018-07-28 15:06:01.989367
2021-10-06 03:18:11.563122
2020-02-03 03:49:10.679979
2021-11-23 00:10:32.978442
2022-05-02 07:19:52.628239
2022-09-06 04:35:22.303477
2019-12-30 21:09:01.000497
2018-05-20 14:29:01.718251
2021-07-12 19:19:25.227206
2017-05-19 01:29:33.346623
2018-04-20 14:27:42.481472

2021-07-13 07:25:57.027419
2017-09-04 17:42:32.000892
2017-09-22 15:51:30.668427
2021-03-10 16:23:59.289211
2017-05-23 19:31:22.944696
2017-04-17 22:02:09.224751
2018-07-16 05:30:55.038400
2021-05-05 03:31:27.014192
2020-10-28 04:20:26.966464
2020-06-12 18:05:00.081745
2021-01-27 19:27:50.913507
2018-09-03 17:51:36.444956
2019-06-09 10:53:43.535256
2018-09-05 02:42:45.255127
2021-04-18 02:24:10.955836
2019-07-07 20:23:22.370479
2017-11-28 09:28:53.536449
2019-06-26 09:10:00.754273
2017-06-06 04:11:15.680950
2017-01-24 20:19:51.856762
2018-07-15 08:59:53.714877
2017-07-19 02:01:53.433689
2022-02-25 00:34:04.461349
2022-11-09 18:09:04.228516
2019-05-01 15:57:17.442133
2017-02-09 20:14:09.497101
2019-07-25 19:53:16.778290
2021-02-03 18:47:15.719622
2019-06-19 09:30:47.901048
2022-04-22 18:22:05.229713
2018-08-21 19:44:01.955751
2018-06-14 16:31:11.777251
2020-07-20 08:23:37.090950
2022-05-11 09:12:33.896504
2022-12-14 03:09:08.844563
2017-07-28 20:25:23.272859
2019-12-18 04:32:55.715571

2018-04-07 22:57:10.009182
2022-02-16 16:44:29.921924
2020-07-21 10:47:27.367834
2017-07-26 13:48:35.435275
2017-06-24 21:52:12.095595
2019-08-24 08:04:53.163349
2022-05-15 21:59:24.699207
2018-09-09 13:23:06.923684
2021-01-14 02:48:46.854527
2020-01-19 15:37:58.880646
2019-10-29 00:15:54.290920
2021-05-18 14:02:48.045676
2017-07-29 16:32:02.454820
2021-07-03 16:36:24.886878
2019-06-08 17:01:44.664940
2017-02-01 13:36:27.635205
2018-03-28 00:30:37.066689
2021-11-05 12:50:41.106549
2019-01-26 07:13:06.778243
2021-05-02 00:37:47.012983
2017-12-02 17:08:51.993099
2021-09-25 05:42:03.741243
2020-02-14 05:25:41.643235
2017-04-07 17:24:37.335656
2022-09-05 02:24:44.685886
2018-10-23 05:38:41.943033
2020-04-08 15:42:36.180927
2019-07-28 05:19:44.936197
2017-11-05 14:49:53.844047
2021-03-07 19:48:30.197202
2021-07-08 14:44:16.052239
2022-07-17 15:15:51.231976
2018-03-28 02:53:28.242391
2017-11-22 00:35:58.049572
2018-12-18 23:17:06.784962
2020-09-26 20:11:54.960072
2017-04-05 18:29:16.719407

2022-07-30 07:09:54.325828
2017-04-03 13:25:38.333217
2022-03-03 08:27:19.290623
2019-09-14 11:33:40.804109
2017-01-10 19:50:47.507583
2017-12-05 12:20:10.806632
2022-06-02 06:40:37.675459
2018-11-08 00:02:32.040117
2019-11-12 14:16:28.851178
2018-03-20 10:25:56.646505
2021-01-15 09:27:37.615410
2017-02-11 06:52:37.590688
2017-10-08 03:56:41.270912
2021-04-25 00:59:03.471723
2017-09-18 03:43:42.693560
2022-04-18 09:16:20.370582
2021-05-09 01:29:25.089832
2020-03-04 20:03:00.450088
2017-07-28 20:33:10.949026
2021-01-30 22:32:27.211330
2017-04-13 04:51:16.053081
2022-12-14 10:30:29.546440
2019-04-27 19:02:09.291656
2022-10-17 17:54:25.485842
2018-01-16 06:54:03.294937
2022-09-03 21:43:49.146586
2019-11-09 22:26:09.784322
2021-10-25 20:03:54.365872
2021-02-11 03:36:44.632339
2021-09-09 06:26:58.300697
2020-01-01 01:22:11.381686
2019-09-02 05:12:16.927797
2022-03-19 12:32:59.855596
2021-07-31 12:30:28.085048
2019-02-20 21:24:09.403406
2022-05-26 22:16:42.658969
2018-05-08 02:38:31.614801

2021-08-23 15:09:10.926443
2018-01-06 05:39:41.579148
2021-10-09 06:30:31.313735
2019-04-04 06:39:16.964262
2021-01-08 23:44:57.341173
2020-06-17 18:36:29.086008
2017-02-05 02:28:20.926839
2018-10-30 07:07:19.410036
2018-07-26 20:32:45.557067
2019-03-31 09:20:26.891865
2019-10-12 21:17:27.002158
2020-10-25 15:20:43.044428
2022-07-13 01:16:26.397406
2018-04-17 08:02:43.748815
2017-09-28 01:28:32.688587
2018-11-16 09:20:37.673581
2021-10-20 21:44:46.932724
2017-06-20 15:33:02.891106
2021-01-20 17:12:06.468116
2019-07-28 10:35:33.920675
2018-07-17 09:48:11.367565
2021-02-24 23:26:03.155771
2021-10-20 10:33:19.463836
2018-03-15 10:43:37.215097
2020-02-22 03:23:56.620314
2018-05-14 07:15:35.121627
2018-05-15 15:43:31.593112
2022-07-21 19:44:23.486847
2017-04-06 19:18:51.113443
2019-10-09 14:09:46.260954
2019-05-16 11:29:25.432610
2021-08-24 01:45:49.816725
2020-10-22 01:12:00.873367
2018-12-02 17:47:18.933960
2020-12-27 22:25:08.690039
2019-05-24 18:51:22.520866
2020-04-13 15:43:37.345110

2021-01-24 03:00:22.723299
2020-08-12 07:42:24.388192
2017-05-28 10:52:15.001553
2022-11-16 16:08:49.965750
2020-06-17 15:19:09.597568
2018-04-07 12:01:27.516028
2018-04-08 02:55:15.235405
2022-12-15 01:36:31.001332
2022-01-29 23:39:27.794990
2022-08-10 21:21:31.449130
2019-11-04 10:57:36.792674
2020-02-26 18:46:04.774822
2021-12-01 15:43:30.782770
2020-03-28 00:57:51.951723
2021-01-30 20:40:43.243355
2020-03-03 08:18:51.276759
2019-04-08 02:19:02.795637
2018-07-16 15:33:50.232855
2020-05-03 19:53:55.749096
2019-09-21 21:53:33.897556
2022-04-05 11:23:00.820706
2021-07-20 17:38:42.718303
2020-08-01 09:20:23.288249
2019-04-16 07:36:47.251899
2018-04-17 05:40:49.767541
2017-08-07 02:43:18.013876
2018-06-09 04:44:56.137237
2018-11-08 23:32:11.676792
2017-04-04 12:59:53.439963
2018-02-07 16:26:18.030891
2018-06-09 12:16:57.258621
2021-07-22 19:27:41.976108
2022-03-15 19:08:33.839176
2019-01-27 19:41:24.831168
2017-09-13 12:38:24.900187
2021-08-19 06:29:15.345779
2021-11-13 00:09:33.350238

2017-07-28 23:54:37.241988
2020-09-14 20:05:27.626392
2021-01-30 15:14:11.342031
2019-06-17 17:45:27.336557
2020-05-27 13:16:56.136069
2017-03-05 17:48:25.186429
2017-08-16 21:31:10.699575
2017-02-07 02:04:30.903134
2017-11-17 13:19:26.291719
2019-04-07 19:06:11.753765
2019-12-24 12:24:06.815503
2018-10-27 02:52:44.736689
2021-08-17 04:22:16.462212
2018-03-17 14:48:45.660313
2022-03-06 13:37:40.443287
2017-02-15 06:01:24.077936
2018-06-25 14:40:18.915888
2021-03-15 05:35:21.267192
2020-07-24 18:17:31.670825
2017-12-31 23:33:40.694128
2020-10-29 02:08:19.603320
2019-01-15 17:05:13.072795
2021-04-21 21:46:29.248396
2022-02-25 12:26:58.372527
2019-09-09 09:55:40.256217
2022-10-06 12:25:34.721402
2017-09-30 10:02:27.835888
2021-04-09 03:49:55.127030
2019-03-14 18:32:42.383936
2018-11-20 08:12:27.315362
2019-10-29 16:14:52.013488
2021-05-05 06:55:49.299597
2017-11-27 10:32:23.240588
2020-10-22 10:18:27.073571
2020-02-20 23:22:45.545279
2020-11-28 08:18:13.975524
2020-05-29 13:38:32.516420

2021-09-22 11:33:46.340654
2019-05-20 03:03:47.896383
2019-07-16 00:51:04.744792
2017-06-10 08:55:41.980783
2019-02-12 01:51:38.934477
2022-05-16 20:04:38.237322
2019-12-16 14:23:52.054347
2022-06-05 19:11:53.843593
2022-05-30 18:48:34.573842
2022-05-18 05:10:05.172273
2021-01-02 01:25:12.870420
2022-05-20 07:19:49.138821
2021-07-25 05:57:45.906409
2017-09-26 01:05:50.299438
2022-07-30 04:12:11.367776
2019-07-21 10:04:41.920779
2019-05-13 13:18:27.663358
2020-01-04 09:29:58.609235
2017-05-08 19:02:16.836928
2020-07-01 04:08:00.653316
2019-03-07 20:15:29.528117
2017-06-09 12:02:33.219471
2020-01-14 06:22:38.483133
2020-06-02 22:36:00.909995
2017-01-12 22:22:31.011424
2021-09-09 07:45:40.752489
2022-03-22 10:29:53.923769
2019-12-15 01:34:40.766329
2019-04-23 18:03:05.745986
2017-01-28 05:15:57.942904
2017-03-17 17:58:59.968236
2019-08-23 13:26:48.232696
2018-05-21 14:28:02.743703
2020-12-17 23:43:36.877030
2021-01-23 15:29:08.198382
2020-11-16 13:20:03.716753
2022-07-19 15:36:49.227000

2018-08-25 23:43:40.932888
2017-09-29 15:23:46.649778
2019-03-17 13:01:17.711921
2022-07-06 22:30:55.338612
2019-09-02 03:03:09.904919
2017-07-04 14:37:50.260869
2018-09-13 03:18:07.147475
2020-12-26 19:09:04.498319
2021-06-30 12:03:48.549012
2018-01-23 21:45:05.129726
2017-11-22 17:36:53.617541
2021-07-18 12:40:18.157404
2019-12-12 01:50:48.031545
2021-03-15 00:58:49.376482
2021-07-11 02:03:03.710620
2021-11-22 22:07:29.016129
2017-09-30 13:19:48.995766
2019-11-13 11:34:51.782146
2020-10-11 11:24:28.669540
2018-12-15 01:24:16.660848
2021-04-07 23:51:42.065464
2022-11-27 01:04:14.597706
2018-11-27 03:19:18.133003
2022-06-15 08:28:33.822697
2018-03-23 06:11:21.233097
2017-12-26 19:44:45.198952
2017-08-01 18:26:07.935065
2018-06-19 17:29:53.037735
2021-12-25 14:15:29.055736
2021-03-17 10:46:15.561000
2018-10-28 19:09:10.718266
2020-04-06 10:41:09.152563
2018-07-20 14:11:56.302566
2022-10-22 23:04:35.625210
2017-02-18 08:45:51.772239
2022-06-08 08:31:44.233577
2019-12-29 20:08:38.493080

2018-02-26 06:11:37.754371
2022-12-14 10:07:35.965217
2017-07-04 22:01:31.500974
2020-04-18 07:02:29.131378
2019-04-12 03:34:56.871568
2019-11-22 04:41:20.986557
2021-07-10 13:39:57.713086
2018-10-13 23:29:21.226161
2018-05-11 13:25:20.831509
2018-06-04 07:54:05.445266
2019-05-01 02:19:40.224761
2019-07-25 07:54:48.020471
2017-01-30 12:17:30.367960
2017-07-29 16:59:08.240557
2017-06-03 21:12:01.328735
2017-08-12 13:26:32.263847
2018-12-15 18:50:56.889142
2020-03-01 11:21:33.673037
2020-11-02 07:02:24.989981
2021-01-01 09:25:54.795901
2020-07-29 03:15:55.873551
2020-09-21 10:09:16.444500
2022-12-30 23:46:30.807551
2018-07-01 05:20:28.814752
2022-07-29 21:03:38.226889
2018-03-29 09:11:55.006829
2020-11-28 09:22:36.209476
2017-12-27 17:55:50.791870
2021-04-24 07:10:46.576690
2017-08-10 23:22:10.241971
2022-03-07 12:50:38.052173
2020-02-11 00:45:58.236143
2018-08-07 06:02:08.912864
2018-01-30 01:35:49.385497
2020-11-09 22:55:33.353554
2019-05-11 08:54:57.690746
2022-10-29 23:17:07.498227

In [17]:
#Creating a ride date column
ride_sharing['ride_date'] = random_date

### Back to the future
The data pipeline feeding into the ride_sharing DataFrame has been updated to register each ride's date. This information is stored in the ride_date column with dtype object, which is how pandas represents strings.

A bug was discovered that recorded rides taken today as taken next year. To fix this, you will find all values of the ride_date column that fall in the future and set them to today's date, so that today becomes the maximum possible value of the column. Before doing so, you need to convert ride_date to a datetime object.

The datetime module has been imported as dt, alongside all the packages you've been using so far.

In [18]:
import datetime as dt
# Convert ride_date to datetime
ride_sharing['ride_dt'] = pd.to_datetime(ride_sharing['ride_date'])

# Save today's date
today = pd.Timestamp('today')

# Set all in the future to today's date
ride_sharing.loc[ride_sharing['ride_dt'] > today, 'ride_dt'] = today

# Print maximum of ride_dt column
print(ride_sharing['ride_dt'].max())

2020-02-09 00:44:50.196057
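
The same convert-then-cap pattern can be tried on a small frame. The following is a minimal sketch on made-up data (the `rides` frame and its three timestamps are assumptions for illustration, not rows from ride_sharing):

```python
import pandas as pd

# Made-up ride dates stored as strings (object dtype); the last one is in the future
rides = pd.DataFrame({
    "ride_date": [
        "2019-03-01 08:15:00.123456",
        "2020-01-15 17:40:00.654321",
        "2100-06-30 12:00:00.500000",
    ]
})

# Convert the strings to datetime values
rides["ride_dt"] = pd.to_datetime(rides["ride_date"])

# Take a single snapshot of "now" so every comparison uses the same reference
today = pd.Timestamp("today")

# Boolean mask of rides dated after the snapshot, then cap them
in_future = rides["ride_dt"] > today
rides.loc[in_future, "ride_dt"] = today

# After capping, no value can exceed the snapshot
print(rides["ride_dt"].max() <= today)
```

Storing `today` once matters: calling `pd.Timestamp('today')` again inside the comparison would produce a slightly later value each time.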


In [19]:
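# Create a subset of the dataset (.loc slices by label and includes the end, so this keeps 78 rows)
# .copy() avoids SettingWithCopyWarning when the subset is modified later
ride_sharing_sub = ride_sharing.loc[0:77, :].copy()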
ride_sharing_sub.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 78 entries, 0 to 77
Data columns (total 16 columns):
 #   Column           Non-Null Count  Dtype         
---  ------           --------------  -----         
 0   Unnamed: 0       78 non-null     int64         
 1   duration         78 non-null     object        
 2   station_A_id     78 non-null     int64         
 3   station_A_name   78 non-null     object        
 4   station_B_id     78 non-null     int64         
 5   station_B_name   78 non-null     object        
 6   bike_id          78 non-null     int64         
 7   user_type        78 non-null     int64         
 8   user_birth_year  78 non-null     int64         
 9   user_gender      78 non-null     object        
 10  user_type_cat    78 non-null     category      
 11  duration_trim    78 non-null     object        
 12  duration_time    78 non-null     int64         
 13  tire_sizes       78 non-null     category      
 14  ride_date        78 non-null     datetime64[

In [20]:
ride_sharing_sub.columns

Index(['Unnamed: 0', 'duration', 'station_A_id', 'station_A_name',
       'station_B_id', 'station_B_name', 'bike_id', 'user_type',
       'user_birth_year', 'user_gender', 'user_type_cat', 'duration_trim',
       'duration_time', 'tire_sizes', 'ride_date', 'ride_dt'],
      dtype='object')

In [21]:
#Dropping unnecessary columns
cols_to_go = ['Unnamed: 0', 'user_type_cat', 'duration_trim', 'duration_time', 'ride_dt']
ride_sharing_sub = ride_sharing_sub.drop(cols_to_go, axis = 1)

In [22]:
# Create an ID for each row (named ride_ids to avoid shadowing the built-in id)
ride_ids = extra.id
ride_sharing_sub.insert(loc = 0, column = 'ride_id', value = ride_ids)

In [23]:
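# Remove the literal text 'minutes' from the duration column, then trim leftover whitespace
# (str.strip('minutes') would strip a set of characters and leave the trailing space)
ride_sharing_sub['duration'] = ride_sharing_sub['duration'].str.replace('minutes', '', regex = False).str.strip()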
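
When trimming a unit such as "minutes" from a string column, note that `str.strip` treats its argument as a set of characters to remove from both ends, not as a substring. A short sketch on made-up values (the `durations` Series is an assumption for illustration) shows the difference:

```python
import pandas as pd

# Made-up durations in the same "<number> minutes" shape as the duration column
durations = pd.Series(["12 minutes", "8 minutes", "11 minutes"])

# strip("minutes") removes the characters m, i, n, u, t, e, s from both ends;
# the space is not in that set, so it is left behind
stripped = durations.str.strip("minutes")
print(stripped.tolist())   # ['12 ', '8 ', '11 ']

# Removing the literal text and then trimming whitespace gives clean values
cleaned = durations.str.replace("minutes", "", regex=False).str.strip()
minutes = cleaned.astype("int64")
print(minutes.tolist())    # [12, 8, 11]
```

`astype("int64")` happens to tolerate surrounding whitespace, but the explicit trim keeps the intermediate strings clean for any other use.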

In [24]:
# Overwrite duration with the prepared values stored in extra.duration
duration = extra.duration
ride_sharing_sub['duration'] = duration

In [25]:
# Overwrite user_birth_year with the prepared values stored in extra.user_birth_year
user_birth_year = extra.user_birth_year
ride_sharing_sub['user_birth_year'] = user_birth_year

In [26]:
ride_sharing_sub.head()

Unnamed: 0,ride_id,duration,station_A_id,station_A_name,station_B_id,station_B_name,bike_id,user_type,user_birth_year,user_gender,tire_sizes,ride_date
0,0,11,81,Berry St at 4th St,323,Broadway at Kearny,5480,2,1988,Male,26,2020-02-09 00:44:50.196057
1,1,8,3,Powell St BART Station (Market St at 4th St),118,Eureka Valley Recreation Center,5193,2,1988,Male,26,2020-02-09 00:44:50.196057
2,2,11,67,San Francisco Caltrain Station 2 (Townsend St...,23,The Embarcadero at Steuart St,3652,3,1988,Male,27,2020-02-09 00:44:50.196057
3,3,7,16,Steuart St at Market St,28,The Embarcadero at Bryant St,1883,1,1969,Male,27,2020-02-09 00:44:50.196057
4,4,11,22,Howard St at Beale St,350,8th St at Brannan St,4626,2,1986,Male,26,2020-02-09 00:44:50.196057


### Finding duplicates
A new update to the data pipeline feeding into ride_sharing has added the ride_id column, which represents a unique identifier for each ride.

The update, however, coincided with radically shorter average ride durations and irregular user birth years set in the future. Most importantly, the number of rides taken increased by 20% overnight, leading you to suspect that the ride_sharing DataFrame contains both complete and incomplete duplicates.

In this exercise, you will confirm this suspicion by finding those duplicates. A sample of ride_sharing is in your environment, as well as all the packages you've been working with thus far.

In [27]:
# Find duplicates
duplicates = ride_sharing_sub.duplicated(subset = 'ride_id', keep = False)
print(duplicates)

0     False
1     False
2     False
3     False
4     False
      ...  
73    False
74     True
75     True
76     True
77     True
Length: 78, dtype: bool
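
The `keep` argument decides which members of a duplicate group get flagged. A small sketch on made-up data (the `rides` frame here is an assumption for illustration) contrasts the default with `keep=False`:

```python
import pandas as pd

# Made-up rides: ride_id 2 and ride_id 3 each appear twice
rides = pd.DataFrame({
    "ride_id": [1, 2, 2, 3, 3],
    "duration": [10, 9, 9, 11, 4],
})

# Default keep='first': the first occurrence is treated as the original,
# so only later repeats are flagged
flag_repeats = rides.duplicated(subset="ride_id")
print(flag_repeats.tolist())   # [False, False, True, False, True]

# keep=False: every row that shares a ride_id with another row is flagged,
# which is what you want when inspecting all members of each group
flag_all = rides.duplicated(subset="ride_id", keep=False)
print(flag_all.tolist())       # [False, True, True, True, True]
```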


In [28]:
# Sort your duplicated rides
duplicated_rides = ride_sharing_sub[duplicates].sort_values(by = 'ride_id')
print(duplicated_rides.head())

    ride_id  duration  station_A_id  \
22       33        10             5   
39       33         2            30   
53       55         9            21   
65       55         9            16   
74       71        11            67   

                                       station_A_name  station_B_id  \
22       Powell St BART Station (Market St at 5th St)           356   
39     San Francisco Caltrain (Townsend St at 4th St)           130   
53   Montgomery St BART Station (Market St at 2nd St)            78   
65                            Steuart St at Market St            93   
74  San Francisco Caltrain Station 2  (Townsend St...            90   

                  station_B_name  bike_id  user_type  user_birth_year  \
22   Valencia St at Clinton Park     2165          2             1979   
39      22nd St Caltrain Station     5213          1             1979   
53           Folsom St at 9th St     1502          2             1985   
65  4th St at Mission Bay Blvd S     5392     

In [29]:
# Print relevant columns of duplicated_rides
print(duplicated_rides[['ride_id','duration','user_birth_year']])

    ride_id  duration  user_birth_year
22       33        10             1979
39       33         2             1979
53       55         9             1985
65       55         9             1985
74       71        11             1997
75       71        11             1997
76       89         9             1986
77       89         9             2060


In [30]:
# Drop complete duplicates from ride_sharing_sub
ride_dup = ride_sharing_sub.drop_duplicates()
# Show the incomplete duplicates that remain (same ride_id, other columns differ)
ride_dup[ride_dup.duplicated(subset = 'ride_id', keep = False)]

Unnamed: 0,ride_id,duration,station_A_id,station_A_name,station_B_id,station_B_name,bike_id,user_type,user_birth_year,user_gender,tire_sizes,ride_date
22,33,10,5,Powell St BART Station (Market St at 5th St),356,Valencia St at Clinton Park,2165,2,1979,Male,26,2020-02-09 00:44:50.196057
39,33,2,30,San Francisco Caltrain (Townsend St at 4th St),130,22nd St Caltrain Station,5213,1,1979,Male,27,2020-02-09 00:44:50.196057
53,55,9,21,Montgomery St BART Station (Market St at 2nd St),78,Folsom St at 9th St,1502,2,1985,Female,27,2020-02-09 00:44:50.196057
65,55,9,16,Steuart St at Market St,93,4th St at Mission Bay Blvd S,5392,2,1985,Male,27,2020-02-09 00:44:50.196057
74,71,11,67,San Francisco Caltrain Station 2 (Townsend St...,90,Townsend St at 7th St,1920,2,1997,Male,27,2020-02-09 00:44:50.196057
75,71,11,21,Montgomery St BART Station (Market St at 2nd St),58,Market St at 10th St,316,2,1997,Female,27,2020-02-09 00:44:50.196057
76,89,9,22,Howard St at Beale St,72,Page St at Scott St,5162,2,1986,Female,27,2020-02-09 00:44:50.196057
77,89,9,21,Montgomery St BART Station (Market St at 2nd St),64,5th St at Brannan St,1299,2,2060,Male,26,2020-02-09 00:44:50.196057
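
`drop_duplicates()` with no arguments removes only rows that match in every column, which is why rows sharing a ride_id can survive it. A minimal sketch on made-up data (the `rides` frame is an assumption for illustration):

```python
import pandas as pd

# Made-up rides: the two ride_id 2 rows are identical (complete duplicates),
# the two ride_id 3 rows differ in duration (incomplete duplicates)
rides = pd.DataFrame({
    "ride_id": [1, 2, 2, 3, 3],
    "duration": [10, 9, 9, 11, 4],
    "user_birth_year": [1988, 1985, 1985, 1979, 1979],
})

# Without a subset, rows must match in every column to count as duplicates
deduped = rides.drop_duplicates()

# Rows that still share a ride_id after that step are incomplete duplicates
leftover = deduped[deduped.duplicated(subset="ride_id", keep=False)]
print(len(deduped))                    # 4
print(leftover["ride_id"].tolist())    # [3, 3]
```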


In [31]:
# Create statistics dictionary for aggregation function
statistics = {'user_birth_year': 'min', 'duration': 'mean'}
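
A dictionary like this maps each column to the function used to collapse rows that share a key. The sketch below applies the same mapping to made-up data (the `rides` frame is an assumption for illustration) to show how one row per ride_id results:

```python
import pandas as pd

# Made-up rides: ride_id 2 appears twice with conflicting values
rides = pd.DataFrame({
    "ride_id": [1, 2, 2],
    "duration": [10, 11, 5],
    "user_birth_year": [1988, 1979, 2060],
})

# Per-column rules: keep the earliest birth year, average the durations
statistics = {"user_birth_year": "min", "duration": "mean"}

# One output row per ride_id; reset_index turns ride_id back into a column
ride_unique = rides.groupby("ride_id").agg(statistics).reset_index()
print(ride_unique)

# No ride_id should be repeated after aggregation
print(ride_unique["ride_id"].is_unique)
```

Taking the minimum birth year is a judgment call that discards implausible future values such as 2060; other datasets may call for a different rule.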

In [32]:
# Group by ride_id and compute new statistics
ride_unique = ride_dup.groupby('ride_id').agg(statistics).reset_index()
ride_unique

Unnamed: 0,ride_id,user_birth_year,duration
0,0,1988,11
1,1,1988,8
2,2,1988,11
3,3,1969,7
4,4,1986,11
...,...,...,...
69,94,1993,25
70,95,1959,11
71,96,1991,7
72,98,1989,21
