# Share Price Data Cleaning
The purpose of this notebook is to show how I cleaned and processed data.

I will:
1. Import the data.
2. Clean the data. This includes:
   - Handling missing values: check for nulls with `.isnull()`, then either fill them with `.fillna()` or drop the column if it is not needed for the analysis.
   - Checking for duplicates with `.duplicated()` and removing any with `.drop_duplicates()`.
   - Converting data types with `.astype()`.
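
As a quick orientation, the steps listed above can be sketched on a tiny, made-up DataFrame. The `toy` data and its values are hypothetical and only stand in for the real share price file; the pandas calls are the same ones used later in this notebook.

```python
import pandas as pd

# Hypothetical toy data standing in for the share price file
toy = pd.DataFrame({
    "Ticker": ["A", "A", "A", "B"],
    "Date": ["2022-01-03", "2022-01-03", "2022-01-04", "2022-01-03"],
    "Close": [156.48, 156.48, 151.19, 10.0],
    "Shares Outstanding": [3.0e8, 3.0e8, None, 5.0e7],
})

# 1. Missing values: count them per column, then fill one column with 0
print(toy.isnull().sum())
toy = toy.fillna({"Shares Outstanding": 0})

# 2. Duplicates: count fully identical rows, then drop them
print(toy.duplicated().sum())
toy = toy.drop_duplicates().reset_index(drop=True)

# 3. Data types: narrow the numeric column, categorise the ticker, parse the date
toy = toy.astype({"Ticker": "category", "Close": "float32"})
toy["Date"] = pd.to_datetime(toy["Date"])
print(toy.dtypes)
```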

## Import data

In [4]:
# libraries
import pandas as pd

In [6]:
%%time
df = pd.read_csv('us-shareprices-daily.csv', delimiter=';')
len(df)


CPU times: user 2.01 s, sys: 660 ms, total: 2.67 s
Wall time: 3.01 s


5322568

In [8]:
# Get a quick overview
df.head()

Unnamed: 0,Ticker,SimFinId,Date,Open,High,Low,Close,Adj. Close,Volume,Dividend,Shares Outstanding
0,A,45846,2018-08-07,66.83,67.94,66.63,67.66,64.78,2829039,,319000000.0
1,A,45846,2018-08-08,67.74,68.15,67.34,67.38,64.51,1682000,,319000000.0
2,A,45846,2018-08-09,67.48,67.62,66.61,66.69,63.85,1727776,,319000000.0
3,A,45846,2018-08-10,66.82,66.87,65.93,66.26,63.44,2166251,,319000000.0
4,A,45846,2018-08-13,66.44,66.99,65.67,65.94,63.13,2989306,,319000000.0


In [10]:
# Get detailed information about the DataFrame df
df.info(verbose=True, memory_usage='deep')

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5322568 entries, 0 to 5322567
Data columns (total 11 columns):
 #   Column              Dtype  
---  ------              -----  
 0   Ticker              object 
 1   SimFinId            int64  
 2   Date                object 
 3   Open                float64
 4   High                float64
 5   Low                 float64
 6   Close               float64
 7   Adj. Close          float64
 8   Volume              int64  
 9   Dividend            float64
 10  Shares Outstanding  float64
dtypes: float64(7), int64(2), object(2)
memory usage: 1013.4 MB


The Dividend column appears to be almost empty: the `describe()` output below shows only 34,181 non-null values out of 5,322,568 rows.

In [13]:
df.describe()

Unnamed: 0,SimFinId,Open,High,Low,Close,Adj. Close,Volume,Dividend,Shares Outstanding
count,5322568.0,5322568.0,5322568.0,5322568.0,5322568.0,5322568.0,5322568.0,34181.0,4942387.0
mean,4779305.0,543.3237,562.3159,511.3188,535.8395,523.8236,1830268.0,0.4,260539900.0
std,5219623.0,45231.43,47011.13,41637.17,44436.38,44410.18,30201710.0,0.902603,5489286000.0
min,18.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,498391.0,8.88,9.11,8.62,8.86,8.32,42788.0,0.12,18930590.0
50%,1108872.0,22.1,22.55,21.64,22.08,20.53,260789.0,0.24,49017300.0
75%,10383500.0,54.01,54.99,53.03,54.0,50.9,962208.5,0.45,128329600.0
max,15665000.0,12000000.0,12000000.0,12000000.0,12000000.0,12000000.0,18489980000.0,68.06,1121052000000.0


## Handling missing values

In [16]:
print(df.isnull().sum())

Ticker                      0
SimFinId                    0
Date                        0
Open                        0
High                        0
Low                         0
Close                       0
Adj. Close                  0
Volume                      0
Dividend              5288387
Shares Outstanding     380181
dtype: int64


The Dividend column is almost entirely null (5,288,387 of 5,322,568 rows). I do not need it for my analysis, so I will list the columns I do want and pass that list to `read_csv` through the `usecols=required_cols` option when I load the final, clean dataset.

In [20]:
# Define the required columns for the data
required_cols = ['Ticker', 'SimFinId', 'Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Shares Outstanding']
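
To illustrate what `usecols` does, here is a minimal sketch that reads a two-row sample from memory instead of the real file. The `sample` text is an assumption modelled on the first rows shown by `df.head()`; only the columns named in `required_cols` end up in the resulting DataFrame.

```python
import io
import pandas as pd

# Hypothetical two-row sample in the same semicolon-delimited layout as the source file
sample = io.StringIO(
    "Ticker;SimFinId;Date;Open;High;Low;Close;Adj. Close;Volume;Dividend;Shares Outstanding\n"
    "A;45846;2018-08-07;66.83;67.94;66.63;67.66;64.78;2829039;;319000000.0\n"
    "A;45846;2018-08-08;67.74;68.15;67.34;67.38;64.51;1682000;;319000000.0\n"
)

required_cols = ['Ticker', 'SimFinId', 'Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Shares Outstanding']

# Only the listed columns are parsed; Dividend and Adj. Close are skipped
sample_df = pd.read_csv(sample, delimiter=';', usecols=required_cols)
print(sample_df.columns.tolist())
```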

I will go ahead and drop the Dividend column and replace the null values in Shares Outstanding with 0.

In [23]:
# Drop column 'Dividend'
df = df.drop('Dividend', axis=1)

In [25]:
df = df.fillna(0)

In [27]:
# Check info to see if it is dropped
df.info(verbose=True, memory_usage='deep')

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5322568 entries, 0 to 5322567
Data columns (total 10 columns):
 #   Column              Dtype  
---  ------              -----  
 0   Ticker              object 
 1   SimFinId            int64  
 2   Date                object 
 3   Open                float64
 4   High                float64
 5   Low                 float64
 6   Close               float64
 7   Adj. Close          float64
 8   Volume              int64  
 9   Shares Outstanding  float64
dtypes: float64(6), int64(2), object(2)
memory usage: 972.8 MB


In [29]:
print(df.isnull().sum())

Ticker                0
SimFinId              0
Date                  0
Open                  0
High                  0
Low                   0
Close                 0
Adj. Close            0
Volume                0
Shares Outstanding    0
dtype: int64


The missing values in the Shares Outstanding column are now 0. I might remove this column if I find I do not need it for my analysis.
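
Because `df.fillna(0)` fills every column, it may be worth limiting the fill to the column that actually needs it. The sketch below uses a hypothetical `toy` DataFrame to show that, along with one possible alternative (forward-filling within each ticker) that this notebook does not use.

```python
import pandas as pd

# Hypothetical data with gaps in two different columns
toy = pd.DataFrame({
    "Ticker": ["A", "A", "A", "B", "B"],
    "Close": [1.0, None, 3.0, 4.0, 5.0],
    "Shares Outstanding": [100.0, None, 120.0, None, 50.0],
})

# Option 1: fill only the chosen column, leaving gaps in other columns visible
filled_zero = toy.fillna({"Shares Outstanding": 0})

# Option 2 (an alternative, not what this notebook does):
# carry the last known value forward within each ticker
filled_ffill = toy.copy()
filled_ffill["Shares Outstanding"] = toy.groupby("Ticker")["Shares Outstanding"].ffill()

print(filled_zero)
print(filled_ffill)
```

With the forward fill, a ticker whose first row is missing still has a null there, so a follow-up decision is needed for those rows.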

## Check for duplicate values 

In [33]:
# df.duplicated().sum()
df.groupby(df.columns.tolist(),as_index=False).size()

Unnamed: 0,Ticker,SimFinId,Date,Open,High,Low,Close,Adj. Close,Volume,Shares Outstanding,size
0,A,45846,2018-08-07,66.83,67.94,66.63,67.66,64.78,2829039,319000000.0,1
1,A,45846,2018-08-08,67.74,68.15,67.34,67.38,64.51,1682000,319000000.0,1
2,A,45846,2018-08-09,67.48,67.62,66.61,66.69,63.85,1727776,319000000.0,1
3,A,45846,2018-08-10,66.82,66.87,65.93,66.26,63.44,2166251,319000000.0,1
4,A,45846,2018-08-13,66.44,66.99,65.67,65.94,63.13,2989306,319000000.0,1
...,...,...,...,...,...,...,...,...,...,...,...
5322563,ZYXI,171401,2023-07-05,9.50,9.54,9.15,9.17,9.17,215455,36435000.0,1
5322564,ZYXI,171401,2023-07-06,9.02,9.18,8.93,9.01,9.01,191404,36435000.0,1
5322565,ZYXI,171401,2023-07-07,9.03,9.29,8.94,9.00,9.00,291326,36435000.0,1
5322566,ZYXI,171401,2023-07-10,9.00,9.23,8.99,9.18,9.18,148425,36435000.0,1


Determine if there are duplicates.

In [36]:
duplicates = df[df.duplicated()] 
print(duplicates)

Empty DataFrame
Columns: [Ticker, SimFinId, Date, Open, High, Low, Close, Adj. Close, Volume, Shares Outstanding]
Index: []


There are no duplicates in the dataset.
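
`df.duplicated()` only flags rows that match in every column. For price data it can also be useful to look for repeated (Ticker, Date) pairs whose prices differ. The sketch below shows that check on hypothetical data; the `subset` and `keep` parameters are standard pandas options.

```python
import pandas as pd

# Hypothetical data: the first two rows share Ticker and Date but differ in Close
toy = pd.DataFrame({
    "Ticker": ["A", "A", "A", "B"],
    "Date": ["2022-01-03", "2022-01-03", "2022-01-04", "2022-01-03"],
    "Close": [156.48, 156.50, 151.19, 10.0],
})

# Rows identical in every column
full_row_dupes = toy.duplicated().sum()

# Rows that repeat the same Ticker/Date key; keep=False marks every copy
key_dupes = toy.duplicated(subset=["Ticker", "Date"], keep=False)

print(full_row_dupes)
print(toy[key_dupes])
```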

## Converting data types
Define what data types will be best so that they don't use too much memory.

In [40]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5322568 entries, 0 to 5322567
Data columns (total 10 columns):
 #   Column              Dtype  
---  ------              -----  
 0   Ticker              object 
 1   SimFinId            int64  
 2   Date                object 
 3   Open                float64
 4   High                float64
 5   Low                 float64
 6   Close               float64
 7   Adj. Close          float64
 8   Volume              int64  
 9   Shares Outstanding  float64
dtypes: float64(6), int64(2), object(2)
memory usage: 406.1+ MB


I tried to change the float64 and int64 columns down to float16 and int16, but these were too small and corrupted the data, so I reloaded the dataset and found that float32 and int32 worked better.

I will change the float and int data types to float32 and int32, respectively, and the tickers to category. I converted the Date column manually here, but when I bring in the dataset again I will use the `parse_dates` option on the `read_csv` call.

In [43]:
for col in df.columns:
    if df[col].dtype == 'float64':
        df[col] = df[col].astype('float32')
    if df[col].dtype == 'int64':
        df[col] = df[col].astype('int32')
    if df[col].dtype == 'object' and df[col].name == 'Ticker':
        df[col] = df[col].astype('category')
    if df[col].name == 'Date':
        df[col] = df[col].astype('datetime64[ns]')

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5322568 entries, 0 to 5322567
Data columns (total 10 columns):
 #   Column              Dtype         
---  ------              -----         
 0   Ticker              category      
 1   SimFinId            int32         
 2   Date                datetime64[ns]
 3   Open                float32       
 4   High                float32       
 5   Low                 float32       
 6   Close               float32       
 7   Adj. Close          float32       
 8   Volume              int32         
 9   Shares Outstanding  float32       
dtypes: category(1), datetime64[ns](1), float32(6), int32(2)
memory usage: 213.4 MB


The code above loops through the columns and changes each type. Memory usage dropped by almost half, from 406.1+ MB to 213.4 MB.

Run `.head()` and `.describe()` again to see whether changing the types affected the data adversely.

In [46]:
df.head()

Unnamed: 0,Ticker,SimFinId,Date,Open,High,Low,Close,Adj. Close,Volume,Shares Outstanding
0,A,45846,2018-08-07,66.830002,67.940002,66.629997,67.660004,64.779999,2829039,319000000.0
1,A,45846,2018-08-08,67.739998,68.150002,67.339996,67.379997,64.510002,1682000,319000000.0
2,A,45846,2018-08-09,67.480003,67.620003,66.610001,66.690002,63.849998,1727776,319000000.0
3,A,45846,2018-08-10,66.82,66.870003,65.93,66.260002,63.439999,2166251,319000000.0
4,A,45846,2018-08-13,66.440002,66.989998,65.669998,65.940002,63.130001,2989306,319000000.0


In [48]:
df.describe()


Unnamed: 0,SimFinId,Date,Open,High,Low,Close,Adj. Close,Volume,Shares Outstanding
count,5322568.0,5322568,5322568.0,5322568.0,5322568.0,5322568.0,5322568.0,5322568.0,5322568.0
mean,4779305.0,2021-02-25 05:37:53.133722112,543.3233,562.3159,511.3192,535.8393,523.8233,1694702.0,241930100.0
min,18.0,2018-08-07 00:00:00,0.0,0.0,0.0,0.0,0.0,-2125784000.0,0.0
25%,498391.0,2019-12-10 00:00:00,8.88,9.11,8.62,8.86,8.32,42778.0,13708000.0
50%,1108872.0,2021-03-23 00:00:00,22.1,22.55,21.64,22.08,20.53,260750.0,43223180.0
75%,10383500.0,2022-05-17 00:00:00,54.01,54.99,53.03,54.0,50.9,962067.8,117766700.0
max,15665000.0,2023-07-11 00:00:00,12000000.0,12000000.0,12000000.0,12000000.0,12000000.0,2146534000.0,1121052000000.0
std,5219623.0,,45228.12,47007.81,41633.99,44433.1,44407.03,16095910.0,5285733000.0


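The check shows one adverse effect: the Volume minimum is now -2,125,784,000, although it was 0 before the conversion. The original Volume maximum (18,489,980,000) is larger than the biggest value int32 can hold (2,147,483,647), so those values overflowed, and Volume needs to stay int64.

A dictionary will be used in the `read_csv()` call to set the data types when I bring in the clean dataset, and I will add the `parse_dates` option so that the Date column is read in as a datetime type.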
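
One way to decide whether a narrower integer type is safe is to compare a column's range with the limits of the target type before casting. The sketch below does this for two hypothetical Series whose extremes are taken from the first `describe()` output. The `fits` helper is an illustrative name, not part of pandas; `np.iinfo` and `pd.to_numeric(..., downcast='integer')` are existing NumPy and pandas functions.

```python
import numpy as np
import pandas as pd

# Hypothetical Series using the min/max values reported by describe() earlier
volume = pd.Series([0, 962208, 18489980000], dtype="int64")
simfin_id = pd.Series([18, 45846, 15665000], dtype="int64")

print(np.iinfo("int32").max)  # largest value an int32 can hold

def fits(series, dtype):
    """Return True if every value in the series is within the range of dtype."""
    info = np.iinfo(dtype)
    return bool(series.min() >= info.min and series.max() <= info.max)

print(fits(volume, "int32"))
print(fits(simfin_id, "int32"))

# pd.to_numeric picks the smallest integer type that can hold all values
volume_small = pd.to_numeric(volume, downcast="integer")
simfin_small = pd.to_numeric(simfin_id, downcast="integer")
print(volume_small.dtype, simfin_small.dtype)
```

Checking the range first avoids the silent wrap-around that a plain `.astype('int32')` produces when values are too large.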

## Check data by running queries

In [54]:
# Pull data by Ticker
gme_stock_data = df[df['Ticker'] == 'GME']
gme_stock_data

Unnamed: 0,Ticker,SimFinId,Date,Open,High,Low,Close,Adj. Close,Volume,Shares Outstanding
2053021,GME,44534,2018-08-07,3.730000,3.810000,3.710000,3.780000,3.470000,7813724,407600000.0
2053022,GME,44534,2018-08-08,3.780000,3.830000,3.750000,3.810000,3.500000,8715468,407600000.0
2053023,GME,44534,2018-08-09,3.810000,3.870000,3.750000,3.800000,3.490000,9370924,407600000.0
2053024,GME,44534,2018-08-10,3.790000,3.900000,3.770000,3.830000,3.520000,9666664,407600000.0
2053025,GME,44534,2018-08-13,3.840000,3.850000,3.780000,3.800000,3.490000,8403788,407600000.0
...,...,...,...,...,...,...,...,...,...,...
2054255,GME,44534,2023-07-05,24.639999,24.850000,23.790001,23.900000,23.900000,2268244,304500000.0
2054256,GME,44534,2023-07-06,23.520000,23.570000,22.820000,22.830000,22.830000,2390303,304500000.0
2054257,GME,44534,2023-07-07,22.969999,23.530001,22.670000,22.709999,22.709999,2447203,304500000.0
2054258,GME,44534,2023-07-10,22.610001,23.559999,22.000000,23.540001,23.540001,3318214,304500000.0


In [56]:
# Make sure the Date column is a datetime type (no change if it was already converted)

df['Date'] = pd.to_datetime(df['Date'])

# Set a new dataframe with the filtered data for year 2022, reset index
stock_data_2022 = df[df['Date'].dt.year == 2022].reset_index()

stock_data_2022

Unnamed: 0,index,Ticker,SimFinId,Date,Open,High,Low,Close,Adj. Close,Volume,Shares Outstanding
0,858,A,45846,2022-01-03,159.000000,159.440002,153.929993,156.479996,153.809998,1606323,302722656.0
1,859,A,45846,2022-01-04,155.490005,155.630005,149.699997,151.190002,148.610001,2233958,302722656.0
2,860,A,45846,2022-01-05,150.830002,153.100006,148.529999,148.600006,146.070007,2370529,302722656.0
3,861,A,45846,2022-01-06,148.850006,149.960007,145.580002,149.119995,146.580002,2298277,302722656.0
4,862,A,45846,2022-01-07,149.119995,149.729996,145.089996,145.149994,142.679993,2058658,302722656.0
...,...,...,...,...,...,...,...,...,...,...,...
1171069,5322433,ZYXI,171401,2022-12-23,13.630000,14.070000,13.630000,13.810000,13.810000,180392,38046000.0
1171070,5322434,ZYXI,171401,2022-12-27,14.020000,14.090000,13.670000,13.760000,13.760000,143701,38046000.0
1171071,5322435,ZYXI,171401,2022-12-28,13.690000,13.900000,13.630000,13.880000,13.880000,137809,38046000.0
1171072,5322436,ZYXI,171401,2022-12-29,13.950000,14.200000,13.810000,13.870000,13.870000,159746,38046000.0


Determine if my new dataframe has only 2022 dates. 

In [61]:
stock_data_2022['Date'].agg(['min', 'max'])

min   2022-01-03
max   2022-12-30
Name: Date, dtype: datetime64[ns]

I have successfully brought in data from only 2022.

In [64]:
stock_data_2022.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1171074 entries, 0 to 1171073
Data columns (total 11 columns):
 #   Column              Non-Null Count    Dtype         
---  ------              --------------    -----         
 0   index               1171074 non-null  int64         
 1   Ticker              1171074 non-null  category      
 2   SimFinId            1171074 non-null  int32         
 3   Date                1171074 non-null  datetime64[ns]
 4   Open                1171074 non-null  float32       
 5   High                1171074 non-null  float32       
 6   Low                 1171074 non-null  float32       
 7   Close               1171074 non-null  float32       
 8   Adj. Close          1171074 non-null  float32       
 9   Volume              1171074 non-null  int32         
 10  Shares Outstanding  1171074 non-null  float32       
dtypes: category(1), datetime64[ns](1), float32(6), int32(2), int64(1)
memory usage: 56.0 MB


Look at the first 10 records.

In [66]:
stock_data_2022.head(10)

Unnamed: 0,index,Ticker,SimFinId,Date,Open,High,Low,Close,Adj. Close,Volume,Shares Outstanding
0,858,A,45846,2022-01-03,159.0,159.440002,153.929993,156.479996,153.809998,1606323,302722656.0
1,859,A,45846,2022-01-04,155.490005,155.630005,149.699997,151.190002,148.610001,2233958,302722656.0
2,860,A,45846,2022-01-05,150.830002,153.100006,148.529999,148.600006,146.070007,2370529,302722656.0
3,861,A,45846,2022-01-06,148.850006,149.960007,145.580002,149.119995,146.580002,2298277,302722656.0
4,862,A,45846,2022-01-07,149.119995,149.729996,145.089996,145.149994,142.679993,2058658,302722656.0
5,863,A,45846,2022-01-10,143.289993,145.309998,140.860001,145.160004,142.690002,2548145,302722656.0
6,864,A,45846,2022-01-11,145.0,146.940002,143.809998,146.639999,144.139999,2028671,302722656.0
7,865,A,45846,2022-01-12,147.800003,150.389999,147.550003,149.509995,146.960007,2250847,302722656.0
8,866,A,45846,2022-01-13,149.460007,149.539993,144.850006,145.169998,142.690002,1741764,302722656.0
9,867,A,45846,2022-01-14,144.039993,145.149994,142.360001,144.679993,142.210007,2225442,302722656.0


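Change the Date column back to an object (string) type in its original `YYYY-MM-DD` format before writing it out. This does not save memory: `info()` reports `56.0+ MB` below because the size of the strings themselves is not counted.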

In [73]:
# Format the dates back to the original 'YYYY-MM-DD' strings (vectorised, no helper function needed)
stock_data_2022['Date'] = stock_data_2022['Date'].dt.strftime('%Y-%m-%d')

In [76]:
stock_data_2022.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1171074 entries, 0 to 1171073
Data columns (total 11 columns):
 #   Column              Non-Null Count    Dtype   
---  ------              --------------    -----   
 0   index               1171074 non-null  int64   
 1   Ticker              1171074 non-null  category
 2   SimFinId            1171074 non-null  int32   
 3   Date                1171074 non-null  object  
 4   Open                1171074 non-null  float32 
 5   High                1171074 non-null  float32 
 6   Low                 1171074 non-null  float32 
 7   Close               1171074 non-null  float32 
 8   Adj. Close          1171074 non-null  float32 
 9   Volume              1171074 non-null  int32   
 10  Shares Outstanding  1171074 non-null  float32 
dtypes: category(1), float32(6), int32(2), int64(1), object(1)
memory usage: 56.0+ MB
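
To compare what the two date representations actually cost, `memory_usage(deep=True)` can be used, since it counts the string contents as well. The following is a small sketch on a hypothetical Series of 1,000 dates rather than on the real DataFrame.

```python
import pandas as pd

# Hypothetical Series of 1,000 consecutive dates
dates = pd.Series(pd.date_range("2022-01-03", periods=1000, freq="D"))

# The same dates formatted as 'YYYY-MM-DD' strings
as_text = dates.dt.strftime("%Y-%m-%d")

# deep=True includes the memory held by the strings themselves
datetime_bytes = dates.memory_usage(index=False, deep=True)
text_bytes = as_text.memory_usage(index=False, deep=True)

print(datetime_bytes, text_bytes)
```

Each datetime value takes 8 bytes, so the string version should come out noticeably larger.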


## Create new CSV file from filtered dataframe

Write the dataframe filtered for 2022 to a new CSV file.

In [80]:
# Write the filtered dataframe to a CSV file
stock_data_2022.to_csv('US_Share_Prices_2022.csv', index=False)

Import and inspect the new file to make sure that it contains the data I need.

In [83]:
# Define the required columns for the data
required_cols = ['Ticker', 'SimFinId', 'Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Shares Outstanding']

# Define the data types dictionary for the columns
dt_setup = {
    'Ticker':'category',
    'Open':'float32',
    'High':'float32',
    'Low':'float32',
    'Close':'float32'
}

In [89]:
%%time
new_df = pd.read_csv('US_Share_Prices_2022.csv', delimiter=',', usecols=required_cols, dtype=dt_setup, parse_dates=['Date'])
len(new_df)  # count rows in the newly loaded frame, not the original df


CPU times: user 470 ms, sys: 69.4 ms, total: 540 ms
Wall time: 568 ms


1171074

In [91]:
new_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1171074 entries, 0 to 1171073
Data columns (total 9 columns):
 #   Column              Non-Null Count    Dtype         
---  ------              --------------    -----         
 0   Ticker              1171074 non-null  category      
 1   SimFinId            1171074 non-null  int64         
 2   Date                1171074 non-null  datetime64[ns]
 3   Open                1171074 non-null  float32       
 4   High                1171074 non-null  float32       
 5   Low                 1171074 non-null  float32       
 6   Close               1171074 non-null  float32       
 7   Volume              1171074 non-null  int64         
 8   Shares Outstanding  1171074 non-null  float64       
dtypes: category(1), datetime64[ns](1), float32(4), float64(1), int64(2)
memory usage: 56.0 MB


The new dataframe with the filtered data has been pulled in correctly.

In [94]:
new_df.head(10)

Unnamed: 0,Ticker,SimFinId,Date,Open,High,Low,Close,Volume,Shares Outstanding
0,A,45846,2022-01-03,159.0,159.440002,153.929993,156.479996,1606323,302722660.0
1,A,45846,2022-01-04,155.490005,155.630005,149.699997,151.190002,2233958,302722660.0
2,A,45846,2022-01-05,150.830002,153.100006,148.529999,148.600006,2370529,302722660.0
3,A,45846,2022-01-06,148.850006,149.960007,145.580002,149.119995,2298277,302722660.0
4,A,45846,2022-01-07,149.119995,149.729996,145.089996,145.149994,2058658,302722660.0
5,A,45846,2022-01-10,143.289993,145.309998,140.860001,145.160004,2548145,302722660.0
6,A,45846,2022-01-11,145.0,146.940002,143.809998,146.639999,2028671,302722660.0
7,A,45846,2022-01-12,147.800003,150.389999,147.550003,149.509995,2250847,302722660.0
8,A,45846,2022-01-13,149.460007,149.539993,144.850006,145.169998,1741764,302722660.0
9,A,45846,2022-01-14,144.039993,145.149994,142.360001,144.679993,2225442,302722660.0


In [96]:
new_df.describe()

Unnamed: 0,SimFinId,Date,Open,High,Low,Close,Volume,Shares Outstanding
count,1171074.0,1171074,1171074.0,1171074.0,1171074.0,1171074.0,1171074.0,1171074.0
mean,5383653.0,2022-07-02 07:52:53.044231168,178.5455,181.7529,174.5826,177.7829,1667077.0,301973300.0
min,18.0,2022-01-03 00:00:00,0.0,0.0,0.0,0.0,-2125784000.0,0.0
25%,627777.0,2022-04-01 00:00:00,7.44,7.67,7.2,7.42,38997.0,16325000.0
50%,1299451.0,2022-07-05 00:00:00,19.19,19.62,18.76,19.17,259976.0,47187580.0
75%,11034050.0,2022-09-30 00:00:00,49.05,50.0,48.08,49.04,957031.0,127195200.0
max,15665000.0,2022-12-30 00:00:00,544389.2,544389.2,533345.1,539180.0,2094306000.0,339741600000.0
std,5430753.0,,6925.252,6994.762,6820.709,6903.116,12483390.0,5239273000.0


Check the dates:

In [101]:
new_df['Date'].agg(['min', 'max'])

min   2022-01-03
max   2022-12-30
Name: Date, dtype: datetime64[ns]

## Future code for bringing in an updated CSV from SimFin
Based on the analysis of the existing data, I will use the required columns and data types defined above to load the source file and keep only the rows for 2022.

In [None]:
# Define the required columns for the data
required_cols = ['Ticker', 'SimFinId', 'Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Shares Outstanding']

In [None]:
# Define the data types for the columns
dt_setup = {
    'Ticker':'category',
    'Open':'float32',
    'High':'float32',
    'Low':'float32',
    'Close':'float32'
}

In [10]:
%%time
# Use the nrows option to load only the first 1,000 rows
df = pd.read_csv('us-shareprices-daily.csv', delimiter=';', nrows=1000, usecols=required_cols, dtype=dt_setup, parse_dates=['Date'])
len(df)


CPU times: user 10.8 ms, sys: 5.05 ms, total: 15.9 ms
Wall time: 18.7 ms


1000

In [None]:
# Fill the missing values in Shares Outstanding with 0
df = df.fillna(0)
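
The steps above could be combined so that the large file is read in pieces and only the 2022 rows are kept. The sketch below is one possible approach under stated assumptions: it reads a hypothetical four-row `sample` from memory (standing in for the SimFin file), and the `chunksize` of 2 is chosen only to make the chunking visible.

```python
import io
import pandas as pd

# Hypothetical sample in the same layout as the SimFin file
sample = io.StringIO(
    "Ticker;SimFinId;Date;Open;High;Low;Close;Adj. Close;Volume;Dividend;Shares Outstanding\n"
    "A;45846;2018-08-07;66.83;67.94;66.63;67.66;64.78;2829039;;319000000.0\n"
    "A;45846;2022-01-03;159.0;159.44;153.93;156.48;153.81;1606323;;302722656.0\n"
    "A;45846;2022-01-04;155.49;155.63;149.7;151.19;148.61;2233958;;\n"
    "ZYXI;171401;2023-07-05;9.5;9.54;9.15;9.17;9.17;215455;;36435000.0\n"
)

required_cols = ['Ticker', 'SimFinId', 'Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Shares Outstanding']
dt_setup = {'Ticker': 'category', 'Open': 'float32', 'High': 'float32', 'Low': 'float32', 'Close': 'float32'}

# Read the file in chunks so the whole dataset never has to sit in memory at once
chunks = pd.read_csv(sample, delimiter=';', usecols=required_cols,
                     dtype=dt_setup, parse_dates=['Date'], chunksize=2)

# Keep only the 2022 rows from each chunk, then stitch the pieces together
df_2022 = pd.concat(
    [chunk[chunk['Date'].dt.year == 2022] for chunk in chunks],
    ignore_index=True,
)

# Chunks can carry different category sets, so re-apply the category type after concat
df_2022['Ticker'] = df_2022['Ticker'].astype('category')

# Fill missing Shares Outstanding values with 0, as done earlier in the notebook
df_2022['Shares Outstanding'] = df_2022['Shares Outstanding'].fillna(0)
print(df_2022)
```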

Code for converting dataframe to JSON and compressed JSON:

In [63]:
# df.to_json('US_Share_Prices_2022.json', orient='records', lines=True)

In [None]:
# df.to_json('US_Share_Prices_2022.json.gz', orient='records', lines=True, compression='gzip')
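
As a small check of the commented-out calls above, the sketch below writes a hypothetical two-row DataFrame to gzip-compressed, line-delimited JSON in a temporary directory and reads it back. The file name `sample.json.gz` and the data are made up for illustration.

```python
import os
import tempfile
import pandas as pd

# Hypothetical two-row frame
toy = pd.DataFrame({"Ticker": ["A", "ZYXI"], "Close": [156.48, 13.87]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.json.gz")
    # One JSON object per line, gzip-compressed
    toy.to_json(path, orient="records", lines=True, compression="gzip")
    # Read it back with the matching options
    restored = pd.read_json(path, lines=True, compression="gzip")

print(restored)
```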

In [148]:
# List out dates
df['Date'].dt.date

0      2018-08-07
1      2018-08-08
2      2018-08-09
3      2018-08-10
4      2018-08-13
          ...    
995    2022-07-21
996    2022-07-22
997    2022-07-25
998    2022-07-26
999    2022-07-27
Name: Date, Length: 1000, dtype: object

In [152]:
data_for_desired_year = df[df['Date'].dt.year == 2022]


In [154]:
print(data_for_desired_year)

    Ticker  SimFinId       Date        Open        High         Low  \
858      A    -19690 2022-01-03  159.000000  159.440002  153.929993   
859      A    -19690 2022-01-04  155.490005  155.630005  149.699997   
860      A    -19690 2022-01-05  150.830002  153.100006  148.529999   
861      A    -19690 2022-01-06  148.850006  149.960007  145.580002   
862      A    -19690 2022-01-07  149.119995  149.729996  145.089996   
..     ...       ...        ...         ...         ...         ...   
995      A    -19690 2022-07-21  122.120003  127.339996  122.120003   
996      A    -19690 2022-07-22  127.410004  128.539993  124.309998   
997      A    -19690 2022-07-25  125.050003  125.410004  123.190002   
998      A    -19690 2022-07-26  124.660004  125.629997  123.370003   
999      A    -19690 2022-07-27  125.040001  128.330002  124.709999   

          Close  Volume  
858  156.479996  -32077  
859  151.190002    5734  
860  148.600006   11233  
861  149.119995    4517  
862  145.149994  