<h1 align="center" style="line-height:200%;font-family:Arial;color:#0099cc">
<font face="Arial" color="#0099cc">
Mad Cyclists
</font>
</h1>
<p style="text-align: center;">
<img src="data/img_cyclists.jpg" alt="Cyclists" style="max-width: 50%; height: auto;">
</p>
<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
Cyclists are very sensitive to temperature. The temperature a cyclist feels depends not only on the measured temperature but also on wind speed and humidity. In this exercise, we help a bike rental business analyze a dataset of daily weather conditions and rentals, so that it can rent out more bikes across different days and temperatures.
</font>
</p>


<h2 align="left" style="line-height:200%;font-family:Arial;color:#0099cc">
<font face="Arial" color="#0099cc">
Dataset
</font>
</h2>

<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
By running the cell below, you can read the data for this exercise as a DataFrame. This dataset includes the following columns:
</font>
</p>

<center>
<div style="direction: ltr; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>

<table>
  <tr>
    <th><b>Column</b></th>
    <th><b>Description</b></th>
  </tr>
  <tr>
    <td><code>cnt</code></td>
    <td>Number of bikes rented on the day</td>
  </tr>
  <tr>
    <td><code>t1</code></td>
    <td>Actual measured temperature on the day</td>
  </tr>
  <tr>
    <td><code>t2</code></td>
    <td>Average temperature felt by cyclists</td>
  </tr>
  <tr>
    <td><code>humidity</code></td>
    <td>Air humidity on the day</td>
  </tr>
  <tr>
    <td><code>wind_speed</code></td>
    <td>Wind speed on the day</td>
  </tr>
  <tr>
    <td><code>is_weekend</code></td>
    <td>Whether the day is a non-working day (weekend) or not</td>
  </tr>
  <tr>
    <td><code>season</code></td>
    <td>Season of the year</td>
  </tr>
</table>

</font>
</div>
</center>

In [None]:
import pandas as pd
import numpy as np

df = pd.read_csv('bikes_borrowed.csv')
df.head()

<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
You will learn how to work with the <code>read_csv</code> function in future lessons, but pay attention to the use of <code>head</code> in this code. As mentioned, this method returns the first 5 rows of the DataFrame by default.
</font>
</p>
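
<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
As a minimal sketch of this default, the example below calls <code>head</code> with and without an argument. The DataFrame <code>toy_df</code> and its values are made up for illustration and are not part of the exercise dataset.
</font>
</p>

```python
import pandas as pd

# Hypothetical toy data with 10 rows, unrelated to bikes_borrowed.csv
toy_df = pd.DataFrame({"day": range(1, 11), "cnt": range(100, 200, 10)})

first_five = toy_df.head()    # no argument: the first 5 rows
first_three = toy_df.head(3)  # explicit n: the first 3 rows

print(first_five.shape)   # (rows, columns) of the default result
print(first_three.shape)  # (rows, columns) when n=3
```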

<h2 align="left" style="line-height:200%;font-family:Arial;color:#0099cc">
<font face="Arial" color="#0099cc">
Part 1
</font>
</h2>

<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
As you can see, the column names <code>t1</code> and <code>t2</code> are too generic and do not clearly indicate the meaning of their values. Therefore, it is a good idea to rename them.<br>
In the cell below, rename the column <code>t1</code> to <code>t_real</code> and the column <code>t2</code> to <code>t_feels_like</code>.<br>
To rename indices or columns, you can use the <code>rename</code> function as shown in the code below:<br>
</font>
</p>

```python
df.rename(columns={"col1": "new_col1", "col2": "new_col2"}, inplace=True)
```


In [None]:
# To-Do

df.head()

<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
Pay attention to the <code>inplace=True</code> parameter. This parameter ensures that the <code>rename</code> function applies changes directly to the original DataFrame, instead of creating a copy of the DataFrame, modifying it, and returning the copy.<br>
To better understand this, you can try removing this parameter and then run <code>df.head()</code> again to check whether the column names have changed or not.
</font>
</p>
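
<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
The difference can be sketched on a small example. The DataFrame <code>toy_df</code> and the column names <code>a</code>, <code>b</code>, <code>alpha</code>, and <code>beta</code> are hypothetical and chosen only for illustration.
</font>
</p>

```python
import pandas as pd

# Hypothetical toy data; the column names are placeholders
toy_df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Without inplace=True, rename returns a new DataFrame and leaves toy_df untouched
renamed = toy_df.rename(columns={"a": "alpha"})
columns_after_plain_rename = list(toy_df.columns)
print(columns_after_plain_rename)  # still the original names
print(list(renamed.columns))       # the returned DataFrame has the new name

# With inplace=True, toy_df itself is modified and the call returns None
result = toy_df.rename(columns={"b": "beta"}, inplace=True)
print(list(toy_df.columns))
print(result)
```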


<h2 align="left" style="line-height:200%;font-family:Arial;color:#0099cc">
<font face="Arial" color="#0099cc">
Part 2
</font>
</h2>

<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
Since cyclists often complain about the weather on windy days, we decided to perform calculations only for days with wind speeds greater than 10.<br>
Therefore, in the DataFrame <code>windy_days_df</code>, store only the data where the wind speed is greater than 10 (excluding 10 itself).
</font>
</p>
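
<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
One common way to filter rows by a condition is a boolean mask. The sketch below shows the idea on made-up data; <code>toy_df</code>, <code>city</code>, and <code>rain_mm</code> are hypothetical names that do not appear in the exercise dataset.
</font>
</p>

```python
import pandas as pd

# Hypothetical toy data, unrelated to the bike dataset
toy_df = pd.DataFrame({
    "city": ["A", "B", "C", "D"],
    "rain_mm": [0.0, 5.0, 12.5, 5.0],
})

# A comparison on a column yields a boolean Series, one value per row
mask = toy_df["rain_mm"] > 5.0  # strict inequality: rows equal to 5.0 are excluded

# Indexing with the mask keeps only the rows where it is True
rainy_df = toy_df[mask]
print(rainy_df)
```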

In [None]:
windy_days_df = None # To-Do

windy_days_df.head()

<h2 align="left" style="line-height:200%;font-family:Arial;color:#0099cc">
<font face="Arial" color="#0099cc">
Part 3
</font>
</h2>

<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
For the remaining calculations, we only need the weather-related columns. Therefore, keep only the columns <code>humidity</code>, <code>t_feels_like</code>, <code>t_real</code>, and <code>wind_speed</code> of the DataFrame <code>windy_days_df</code> and drop the remaining columns.
</font>
</p>
<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
<span style="color:green"><b>Hint:</b></span><br>
Consider the order of the columns as follows:<br>
<code>t_real, t_feels_like, humidity, wind_speed</code>
</font>
</p>
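
<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
Selecting several columns can be done by indexing with a list of column names, and the result follows the order of that list. The sketch below uses the hypothetical columns <code>a</code>, <code>b</code>, and <code>c</code> rather than the exercise columns.
</font>
</p>

```python
import pandas as pd

# Hypothetical toy data with placeholder column names
toy_df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# Indexing with a list of names keeps only those columns, in the listed order
subset = toy_df[["c", "a"]]
print(list(subset.columns))
```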

In [None]:
windy_days_df = None # To-Do

windy_days_df.head()

<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
In the next steps, we will modify some values in the DataFrame. To ensure that the original DataFrame and <code>windy_days_df</code> remain unchanged, we need to use the <code>copy</code> method.<br>
In pandas, as in NumPy, the assignment operator does not duplicate data: it only creates a new reference to the same DataFrame. Therefore, a change made through one name is visible through the other as well.<br>
Using the <code>copy</code> function allows us to create a separate copy of the original DataFrame, so changes to one do not affect the other.<br>
We will explore this feature further later, but for now, simply run the cell below.
</font>
</p>
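
<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
A minimal sketch of the difference between assignment and <code>copy</code> is shown below. The names <code>original</code>, <code>alias</code>, and <code>independent</code> are hypothetical and used only for this illustration.
</font>
</p>

```python
import pandas as pd

# Hypothetical toy data
original = pd.DataFrame({"x": [1, 2, 3]})

alias = original               # plain assignment: a second name for the same object
independent = original.copy()  # copy: a separate DataFrame with its own data

alias["x"] = alias["x"] * 10   # this change is also visible through `original`
independent["x"] = 0           # this change affects only the copy

print(alias is original)        # both names refer to one object
print(original["x"].tolist())   # reflects the change made through `alias`
print(independent["x"].tolist())
```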

In [None]:
temperature_df = windy_days_df.copy()

<h2 align="left" style="line-height:200%;font-family:Arial;color:#0099cc">
<font face="Arial" color="#0099cc">
Part 4
</font>
</h2>

<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
To perform better analyses, we need to know how warm a day is relative to the overall temperature range. Follow these steps in order:<br>
1. Store the maximum value of the <code>t_real</code> column in the variable <code>t_max</code>.<br>
2. Store the minimum value of the <code>t_real</code> column in the variable <code>t_min</code>.<br>
3. Add a new column named <code>t_percent</code> to the DataFrame. Use the following formula to normalize and calculate the relative temperature:<br>
</font>
</p>

```python
((temp - min) / (max - min)) * 100
```

<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
<span style="color:green"><b>Hint:</b></span><br>
You can perform all operations in this step using NumPy functions. As mentioned, pandas is built on top of NumPy, so NumPy functions such as <code>np.max</code> and <code>np.min</code> can be applied to DataFrame columns as well.
</font>
</p>
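
<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
To see how this normalization behaves, the sketch below applies the same formula to a small made-up Series. The names <code>values</code>, <code>v_max</code>, <code>v_min</code>, and <code>percent</code> are hypothetical; the smallest value maps to 0 and the largest to 100.
</font>
</p>

```python
import numpy as np
import pandas as pd

# Hypothetical toy values, not taken from the exercise dataset
values = pd.Series([10.0, 15.0, 20.0, 30.0])

v_max = np.max(values)  # NumPy reductions accept a pandas Series
v_min = np.min(values)

# Element-wise min-max scaling to the range 0..100
percent = (values - v_min) / (v_max - v_min) * 100
print(percent.tolist())
```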



In [None]:
t_max = None # To-Do
t_min = None # To-Do
temperature_df['t_percent'] = None # To-Do

temperature_df.head()

<h2 align="left" style="line-height:200%;font-family:Arial;color:#0099cc">
<font face="Arial" color="#0099cc">
Part 5
</font>
</h2>

<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
Now, we want to predict the felt temperature based on the available data and replace the values in the <code>t_feels_like</code> column.<br>
Use the following formula to populate this column:<br>
</font>
</p>

$$
t\_feels\_like = t\_real + \frac{humidity \cdot t\_real}{1000} - \frac{wind\_speed}{10} - 2
$$

<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
<span style="color:orange"><b>Reminder:</b></span>
You can treat each DataFrame column as a separate Series and manipulate it like a NumPy array.
</font>
</p>
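
<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
As an illustration of this reminder, the sketch below combines several columns with ordinary arithmetic operators, which pandas applies row by row. The DataFrame <code>toy_df</code> and its columns <code>price</code>, <code>qty</code>, <code>discount</code>, and <code>total</code> are hypothetical.
</font>
</p>

```python
import pandas as pd

# Hypothetical toy data with placeholder column names
toy_df = pd.DataFrame({
    "price": [10.0, 20.0],
    "qty": [2, 3],
    "discount": [1.0, 5.0],
})

# Each operator works element-wise, so one expression fills the whole column
toy_df["total"] = toy_df["price"] * toy_df["qty"] - toy_df["discount"] / 2
print(toy_df["total"].tolist())
```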

In [None]:
temperature_df['t_feels_like'] = None # To-Do

temperature_df.head()

<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
As you can see, the values in the <code>t_feels_like</code> column have changed and are no longer the same as those in the original DataFrame. Now, let’s examine the functionality of the <code>copy</code> function. If you have followed all the steps correctly, the <code>t_feels_like</code> column in the <code>windy_days_df</code> DataFrame should remain unchanged. To verify this, you can compare the values of these two DataFrames using the <code>head</code> function.
</font>
</p>

In [None]:
windy_days_df.head()

<h2 align="left" style="line-height:200%;font-family:Arial;color:#0099cc">
<font face="Arial" color="#0099cc">
Part 6
</font>
</h2>

<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
Finally, we want to measure how far this prediction is from the recorded values. For this, we use the formula known as <a href="https://en.wikipedia.org/wiki/Mean_absolute_error"><i>Mean Absolute Error</i> or <i>MAE</i></a>; the smaller the value, the closer the prediction.<br>
</font>
</p>

$$
\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} | y_i - \hat{y}_i |
$$

<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
To calculate this value, you can use the <a href="https://numpy.org/doc/stable/reference/generated/numpy.mean.html"><code>np.mean()</code></a> and <a href="https://numpy.org/devdocs/reference/generated/numpy.absolute.html"><code>np.abs()</code></a> functions.<br>
</font>
</p>

$$
\text{difference} = \text{mean}( | t\_feels\_like_{\text{temperature\_df}} - t\_feels\_like_{\text{windy\_days\_df}} | )
$$

<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
Calculate this value and store it in the variable <code>difference</code>.
</font>
</p>
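
<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
The MAE formula can be sketched on two short made-up Series. The names <code>actual</code>, <code>predicted</code>, and <code>mae</code> are hypothetical. Note that pandas aligns two Series by index before subtracting, so this sketch assumes both share the same index, as a DataFrame and its copy do.
</font>
</p>

```python
import numpy as np
import pandas as pd

# Hypothetical toy values with identical default indices (0, 1, 2)
actual = pd.Series([3.0, 5.0, 8.0])
predicted = pd.Series([2.0, 5.0, 11.0])

# Absolute error per row, then the average over all rows
abs_errors = np.abs(actual - predicted)
mae = np.mean(abs_errors)
print(mae)
```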

In [None]:
difference = None # To-Do
print(difference)

<h2 align="left" style="line-height:200%;font-family:Arial;color:#0099cc">
<font face="Arial" color="#0099cc">
Part 7
</font>
</h2>


<p style="direction: ltr; text-align: justify; line-height: 200%; font-family: Arial; font-size: medium">
<font face="Arial" size=3>
In this exercise, we worked with the <code>head(n)</code> function, but let’s explore it in more depth.<br>
Beyond displaying data, this method returns the first <code>n</code> rows as a DataFrame, which can be stored in a separate variable.<br>
In the variable <code>final_df</code> below, store the first 100 rows of the <code>temperature_df</code> DataFrame.
</font>
</p>

In [None]:
final_df = None # To-Do