# Inspecting Time Zones in Stock Data

In this activity, you’ll load historical stock data from Tesla Motors (TSLA) to practice your `datetime` data transformation skills.

Instructions:

1. Read the Tesla historical stock data from the CSV file into a DataFrame.

2. Use the Pandas `head` function to inspect the data. Use the Pandas `info` function to check the data types of each column.

3. Convert the “time” column to the `datetime` data type by using the Pandas `to_datetime` function.

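    > **Hint** Because the “time” column contains timestamps with UTC offsets (-05:00 or -04:00, depending on daylight saving time), remember to set `utc=True` so that every value is converted to UTC.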

4. Use the Pandas `head` and `info` functions to verify the data type transformation and the time zone.

5. Convert the time zone to that of Berlin (`Europe/Berlin`), and verify the time zone transformation by using the Pandas `head` and `info` functions.


References:

[Pandas.to_datetime](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html)

[Pandas.dt.tz_convert](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.dt.tz_convert.html)

[Python time zones](https://pvlib-python.readthedocs.io/en/stable/timetimezones.html)


In [32]:
# Import the required libraries and dependencies.
import pandas as pd
from pathlib import Path

## Step 1: Read the Tesla historical stock data from the CSV file into a DataFrame.

In [33]:
# Read the data from the tsla_historical.csv file into a Pandas DataFrame
df_tsla = pd.read_csv(Path("../Resources/tsla_historical.csv"))
df_tsla.head()


                        time    close
0  2018-01-02 09:30:00-05:00  315.870
1  2018-01-02 09:45:00-05:00  317.500
2  2018-01-02 10:00:00-05:00  318.035
3  2018-01-02 10:15:00-05:00  317.470
4  2018-01-02 10:30:00-05:00  316.875

## Step 2: Use the Pandas `head` function to inspect the data. Use the Pandas `info` function to check the data types of each column.

In [34]:
# Display the first five rows of the DataFrame
df_tsla.head()

                        time    close
0  2018-01-02 09:30:00-05:00  315.870
1  2018-01-02 09:45:00-05:00  317.500
2  2018-01-02 10:00:00-05:00  318.035
3  2018-01-02 10:15:00-05:00  317.470
4  2018-01-02 10:30:00-05:00  316.875

In [35]:
# Inspect the DataFrame's data types using the info function
df_tsla.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 21279 entries, 0 to 21278
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   time    21279 non-null  object 
 1   close   21279 non-null  float64
dtypes: float64(1), object(1)
memory usage: 332.6+ KB

## Step 3: Convert the “time” column to the `datetime` data type by using the Pandas `to_datetime` function.

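> **Hint** Because the “time” column contains timestamps with UTC offsets (-05:00 or -04:00, depending on daylight saving time), remember to set `utc=True` so that every value is converted to UTC.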


In [36]:
# Transform the time column to a time zone-aware datetime data type.
# utc=True converts the mixed -05:00/-04:00 offsets to UTC.
df_tsla["time"] = pd.to_datetime(df_tsla["time"], utc=True)

In [37]:
df_tsla.dtypes

time     datetime64[ns, UTC]
close                float64
dtype: object
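
The effect of `utc=True` on timestamps with different offsets can be illustrated on a small scale. The following is a minimal sketch, not part of the original activity; the `raw` Series is a hypothetical sample built from two timestamps shown in the data above, one with a -05:00 offset and one with a -04:00 offset.

```python
import pandas as pd

# Hypothetical sample: two timestamps copied from the TSLA data,
# one with a winter offset (-05:00) and one with a summer offset (-04:00).
raw = pd.Series([
    "2018-01-02 09:30:00-05:00",
    "2020-09-29 16:00:00-04:00",
])

# utc=True converts every value to UTC, so the result has a single
# time zone-aware datetime dtype instead of mixed offsets.
parsed = pd.to_datetime(raw, utc=True)

print(parsed.dtype)
print(parsed.iloc[0])  # 09:30 at -05:00 is 14:30 in UTC
print(parsed.iloc[1])  # 16:00 at -04:00 is 20:00 in UTC
```

Both values land on the UTC times that appear in the converted DataFrame, even though their original offsets differ.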

## Step 4: Use the Pandas `head` and `info` functions to verify the data type transformation and the time zone.

In [38]:
# Display the first five rows of the DataFrame to confirm
# changes to the time column
df_tsla.head()

                       time    close
0 2018-01-02 14:30:00+00:00  315.870
1 2018-01-02 14:45:00+00:00  317.500
2 2018-01-02 15:00:00+00:00  318.035
3 2018-01-02 15:15:00+00:00  317.470
4 2018-01-02 15:30:00+00:00  316.875

In [39]:
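# Use the info function to confirm the change in data type
# for the time column
df_tsla.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 21279 entries, 0 to 21278
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype              
---  ------  --------------  -----              
 0   time    21279 non-null  datetime64[ns, UTC]
 1   close   21279 non-null  float64            
dtypes: datetime64[ns, UTC](1), float64(1)
memory usage: 332.6 KB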

## Step 5: Convert the time zone to that of Berlin (`Europe/Berlin`), and verify the time zone transformation by using the Pandas `head` and `info` functions.
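
The cell that performs the conversion does not appear in this export. A minimal sketch of that step, assuming the “time” column is already a UTC-aware `datetime` column, might look like the following; the small `sample` DataFrame is a hypothetical stand-in for `df_tsla`, built from the first two rows shown above.

```python
import pandas as pd

# Hypothetical stand-in for df_tsla: two UTC-aware timestamps
# and closing prices taken from the output above.
sample = pd.DataFrame({
    "time": pd.to_datetime(
        ["2018-01-02 14:30:00+00:00", "2018-01-02 14:45:00+00:00"],
        utc=True,
    ),
    "close": [315.870, 317.500],
})

# tz_convert changes the time zone used for display;
# the underlying instant in time stays the same.
sample["time"] = sample["time"].dt.tz_convert("Europe/Berlin")

print(sample)
```

Applied to the activity's DataFrame, the equivalent call would be `df_tsla["time"] = df_tsla["time"].dt.tz_convert("Europe/Berlin")`.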

In [44]:
# View the first five rows of the DataFrame to confirm the
# conversion of the time column
df_tsla.head(5)

                       time    close
0 2018-01-02 15:30:00+01:00  315.870
1 2018-01-02 15:45:00+01:00  317.500
2 2018-01-02 16:00:00+01:00  318.035
3 2018-01-02 16:15:00+01:00  317.470
4 2018-01-02 16:30:00+01:00  316.875


In [45]:
# Use the info function to confirm the change in the time zone
# associated with the time column
df_tsla.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 21279 entries, 0 to 21278
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype                        
---  ------  --------------  -----                        
 0   time    21279 non-null  datetime64[ns, Europe/Berlin]
 1   close   21279 non-null  float64                      
dtypes: datetime64[ns, Europe/Berlin](1), float64(1)
memory usage: 332.6 KB
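
The first five rows all fall in January, so they show only the +01:00 offset. As a complementary check, the sketch below (an illustration, not part of the original activity) reads the time zone directly through the `.dt.tz` attribute and prints the Berlin offset for one winter and one summer timestamp. The `utc_times` Series is a hypothetical sample built from two UTC timestamps shown earlier.

```python
import pandas as pd

# Hypothetical sample: one January and one September timestamp
# taken from the UTC output shown earlier.
utc_times = pd.Series(pd.to_datetime(
    ["2018-01-02 14:30:00+00:00", "2020-09-29 20:00:00+00:00"],
    utc=True,
))

berlin_times = utc_times.dt.tz_convert("Europe/Berlin")

# .dt.tz exposes the time zone without printing the whole column.
print(berlin_times.dt.tz)

# The UTC offset depends on the season: +01:00 (CET) in January
# and +02:00 (CEST) in late September.
for ts in berlin_times:
    print(ts, ts.utcoffset())
```

Because `Europe/Berlin` is a named time zone rather than a fixed offset, daylight saving time is handled per timestamp.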
