# Python Code

# Step 1: Loading a file and preparing the data

In the code below, we ask Kaggle to load a number of Python libraries. Python libraries are pre-written chunks of code which we can use to perform complicated tasks. You have already used the Turtle library to draw shapes on your screen. The individual libraries are marked below with comments (#). You do not need to edit the code in the box below.

In [None]:
import matplotlib.pyplot as plt #Matplotlib allows us to draw graphs
import numpy as np #Numpy allows us to perform complex mathematical processes quickly
import pandas as pd #Pandas is another useful set of tools for statistics
import datetime
        
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

In the box below, we use the Pandas (pd) library to read the contents of the CSV file "bike_hire.csv" and change the format of the date so that we can work with it later.

In [None]:
#load the bike hire data from the CSV file
data = pd.read_csv("/kaggle/input/london-bike-hire/bike_hire.csv")

# Prepare the data - convert the timestamp string into a date (which will allow us to sort and group later)
data['timestamp'] = pd.to_datetime(data['timestamp'], format='%Y-%m-%d %H:%M:%S')

#look at the data types for the csv file - we need to check that the string dates have been converted successfully
print("List of data types: \n", data.dtypes)

print("\n\n")
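
If you would like to see what the date conversion does on its own, here is a minimal sketch. The two rows are made up for illustration and are not taken from bike_hire.csv; the name `sample` is also just an assumption for this example. Once the text has been converted to a date, pandas can answer questions such as "which day of the week was this?".

```python
import pandas as pd

# Made-up rows for illustration only (not taken from bike_hire.csv)
sample = pd.DataFrame({
    "timestamp": ["2015-01-04 00:00:00", "2015-01-04 01:00:00"],
    "count": [182, 138],
})

# Before conversion the timestamps are stored as plain text
print(sample["timestamp"].dtype)

# The same conversion used in the cell above
sample["timestamp"] = pd.to_datetime(sample["timestamp"], format="%Y-%m-%d %H:%M:%S")

# After conversion pandas stores them as a datetime type
print(sample["timestamp"].dtype)

# Dates can now be sorted, filtered and asked for their parts
print(sample["timestamp"].dt.day_name())
print(sample["timestamp"].dt.hour)
```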

# Step 2 - Transforming the data into daily summaries

Now for a bit of fun! We can use the matplotlib library to make a graph of a selected time period to help us understand patterns and trends. Our first job is to filter the data to a date range, which limits the size of the graph, and then add up the hires for each day.

In [None]:
#set a date range for the graph (>= keeps the rows from midnight on the first day, < stops before the end date)
filtered_data = data[(data['timestamp'] >= "2015-01-01") & (data['timestamp'] < "2015-02-01")]


#group by and count the totals for each date
graph_data = filtered_data.groupby(filtered_data.timestamp.dt.date)['count'].sum()

print(graph_data)
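
To see how filtering and grouping work on a small scale, here is a sketch using four made-up rows (they are not taken from bike_hire.csv, and the names `sample`, `january` and `daily` are assumptions for this example). It uses >= for the start date so that the midnight row on the first day is kept, and < for the end date so that 1 February is left out.

```python
import pandas as pd

# Made-up hourly rows for illustration only (not taken from bike_hire.csv)
sample = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2015-01-01 00:00:00", "2015-01-01 09:00:00",
        "2015-01-02 09:00:00", "2015-02-01 00:00:00",
    ]),
    "count": [10, 20, 5, 99],
})

# Keep January only: the row on 1 February is filtered out
january = sample[(sample["timestamp"] >= "2015-01-01") & (sample["timestamp"] < "2015-02-01")]

# Group the hourly rows by calendar date and add up the hires for each day
daily = january.groupby(january["timestamp"].dt.date)["count"].sum()

print(daily)
```

The two rows for 1 January are added together (10 + 20), so the result has one total per day.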

# Step 3 - Plotting bike hires and temperature together

Here, we will plot two sets of data, the daily bike hires and the temperature, on top of each other so that we can start to look at the relationship between them.

In order to plot multiple graphs or data types together, you may need to specify different subplots. This can be done using matplotlib and the subplots() function. 

To get two plots to sit on top of each other, you can use the twinx() function. In this case, it will create a second y-axis, shown on the right of the plot, for the temperature scale. 
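
Here is a minimal sketch of the twinx() idea using a handful of made-up numbers, so that you can try it without the bike hire file. The lists `days`, `hires` and `temps` are assumptions for this example only. The bars use the left-hand y-axis and the line uses the second y-axis on the right.

```python
import datetime
import matplotlib.pyplot as plt

# Made-up values for illustration only (not taken from bike_hire.csv)
days = [datetime.date(2015, 1, d) for d in range(1, 6)]
hires = [120, 150, 90, 200, 170]
temps = [4.0, 6.5, 3.0, 8.0, 7.5]

# Create the figure and the first set of axes (left-hand y-axis)
fig, ax1 = plt.subplots()
ax1.bar(days, hires, color="blue")
ax1.set_xlabel("Date")
ax1.set_ylabel("Bike hires")
ax1.tick_params(axis="x", labelrotation=25)

# Create a second y-axis on the right that shares the same x-axis
ax2 = ax1.twinx()
ax2.plot(days, temps, color="red")
ax2.set_ylabel("Temperature")
```

Because the two axes have their own scales, hires in the hundreds and temperatures in single figures can both be read clearly on the same graph.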

In [None]:
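import matplotlib.pyplot as plt
import matplotlib.dates as md
# specify how the dates should look - we shorten this to make it readable
xfmt = md.DateFormatter('%Y-%m-%d')

# assumption: temp_data is not created earlier in this notebook, so we build it here
# as the average of the 't1' temperature column for each day
temp_data = filtered_data.groupby(filtered_data.timestamp.dt.date)['t1'].mean().reset_index()

# create your figure and its first subplot
fig, ax1 = plt.subplots()
# set up the 2nd axis
ax2 = ax1.twinx()

# graph_data is a Series: its index holds the dates and its values hold the daily totals
ax1.bar(x=graph_data.index, height=graph_data.values, color='blue')
ax1.set_title('Bike hire graph')
ax1.set_xlabel('Timestamp')
ax1.set_ylabel('Bike hires')
# apply the date format and rotate the labels so they are readable
ax1.xaxis.set_major_formatter(xfmt)
ax1.tick_params(axis='x', labelrotation=25)

ax2.plot(temp_data['timestamp'], temp_data['t1'], color='red')
ax2.set_ylabel('Temperature')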

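As you can see from the printed output earlier, the data in our original table has been grouped together by day, and the total number of hires for each day has been calculated. The graph shows those daily totals as bars, with the temperature drawn as a line on the second y-axis.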

## Tasks to complete using the code above
Now it's over to you! Can you make the following changes to the code above?

- Change the date range of the graph so that it covers June 2016. (You'll need to change the date range for `filtered_data`.)
- Change the colour of the bars. Look here for a list of colours: https://matplotlib.org/stable/gallery/color/named_colors.html
- Give the graph a more meaningful title.
- Change the labels on the x and y axes so that they are meaningful.

## Extension
Can you plot a line graph below using the temp_data provided? You can cut & paste the contents of the graph and edit the details.