# Python Code

# Step 1: Loading a file and preparing the data

1. In the code below, we ask Kaggle to load a number of Python libraries. Python libraries are pre-written chunks of code which we can use to perform complicated tasks. You have already used the Turtle library to draw shapes on your screen. The individual libraries are marked below with comments (#). You ***do not*** need to edit the code in the box below.

In [None]:
import matplotlib.pyplot as plt #Matplotlib allows us to draw graphs
import numpy as np #Numpy allows us to perform complex mathematical processes quickly
import pandas as pd #Pandas is another useful set of tools for statistics
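import datetime #Datetime gives us tools for working with dates and times
import os #Os lets us look at the files and folders available to the notebook

#List every file in the Kaggle input folder so that we can check the data is there
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))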

2. In the box below, we use the Pandas library to read the contents of the CSV file "bike_hire.csv" and convert the timestamps from text into dates so that we can work with them later.

In [None]:
#load the bike hire data from the CSV file
data = pd.read_csv("/kaggle/input/london-bike-hire/bike_hire.csv")

# Prepare the data - convert the timestamp string into a date (which will allow us to sort and group later)
data['timestamp'] = pd.to_datetime(data['timestamp'], format='%Y-%m-%d %H:%M:%S')

#look at the data types for the csv file - we need to check that the string dates have been converted successfully
print ("List of data types: \n",data.dtypes)

print("\n\n")
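
To see what the date conversion does, here is a minimal sketch using a tiny table made up for illustration (the rows below are hypothetical and do not come from "bike_hire.csv"). Once the text has been converted into real dates, we can ask questions such as "which hour was this?".

```python
import pandas as pd

# Hypothetical sample rows, made up for illustration (not from bike_hire.csv)
sample = pd.DataFrame({
    "timestamp": ["2016-06-01 08:00:00", "2016-06-01 09:00:00"],
    "count": [120, 95],
})

# Before conversion, the timestamps are stored as plain text
print(sample.dtypes)

# Convert the text into real dates, using the same format string as above
sample["timestamp"] = pd.to_datetime(sample["timestamp"], format="%Y-%m-%d %H:%M:%S")

# After conversion, the column is a date type and we can pull out parts of the date
print(sample.dtypes)
print(sample["timestamp"].dt.hour)
```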


# Step 2: Plotting a Graph Using Matplotlib

1. Now for a bit of fun! We can use the matplotlib library to make a graph of a selected time period to help us understand patterns and trends. Our first job is to filter the data, so that we can limit the size of the graph.

In [None]:
#set a date range for the graph (from the start of 1 June 2016 up to, but not including, 1 July 2016)
filtered_data = data[(data['timestamp']>= "2016-06-01") & (data['timestamp']< "2016-07-01")]


#group by and count the totals for each date
graph_data = filtered_data.groupby(filtered_data.timestamp.dt.date)['count'].sum()

print(graph_data)



^^ As you can see from the output above, the data in our original table has been grouped together by day, and the total number of hires for each day has been calculated.
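
If you would like to see the filtering and grouping steps on a smaller scale, here is a minimal sketch. The four rows are hypothetical values made up for illustration, so that you can check the totals by hand.

```python
import pandas as pd

# Hypothetical hourly rows, made up for illustration (not from bike_hire.csv)
sample = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2016-05-31 23:00:00",
        "2016-06-01 08:00:00",
        "2016-06-01 09:00:00",
        "2016-06-02 08:00:00",
    ]),
    "count": [10, 120, 95, 130],
})

# Keep only the rows that fall in June 2016 (the 31 May row is dropped)
june = sample[(sample["timestamp"] >= "2016-06-01") & (sample["timestamp"] < "2016-07-01")]

# Group the remaining rows by calendar day and add up the hires for each day
daily = june.groupby(june["timestamp"].dt.date)["count"].sum()
print(daily)
```

The two rows for 1 June are added together (120 + 95 = 215), while 2 June has only one row, so its total stays at 130.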

In [None]:
#Define the size of the graph area
plt.figure(figsize = (10,10))

#define the type of graph, colour and the data to be used on the x and y axes
graph_data.plot(kind ='bar',x='timestamp',y='count',color='red',title = 'Fig 1',xlabel = 'Series 1',ylabel = 'Series 2')

#display the graph on screen below vv
plt.show()
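
Here is a minimal sketch of the same kind of bar chart, drawn from three hypothetical daily totals made up for illustration. Because the grouped data is a single column of values with a label for each one, the labels are placed along the x axis and the values set the height of each bar. The colour, title and axis labels are just example choices.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical daily totals, made up for illustration (not from bike_hire.csv)
daily = pd.Series([215, 130, 180], index=["1 June", "2 June", "3 June"])

# Define the size of the graph area
plt.figure(figsize=(6, 4))

# Draw one bar per day: the index gives the x-axis labels, the values give the bar heights
ax = daily.plot(kind="bar", color="green", title="Example: bike hires per day",
                xlabel="Day", ylabel="Number of hires")

# Display the graph
plt.show()
```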

# Tasks to complete using the code above

Now it's over to you! Can you make the following changes to the code above?

1. Change the date range of the graph, so that it covers June 2016? (You'll need to change the date range for the 'filtered_data'.)
2. Change the colour of the bars? Look here for a list of colours: https://matplotlib.org/stable/gallery/color/named_colors.html
3. Edit the title of the graph to something more meaningful.
4. Change the labels on the x and y axes so that they are meaningful.

# Extension

Can you plot a line graph below using the temp_data provided? You can copy and paste the graph code from above and edit the details.




In [None]:
#Find the highest temperature (the 't1' column) for each day
temp_data = filtered_data.groupby(filtered_data.timestamp.dt.date)['t1'].max()
print(temp_data)
