## <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%204%20-%20Import%20Package%20for%20Learning.png"> Import Packages for Learning

import numpy as np
import pandas as pd

---

## <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%2010-%20Lesson%20Content.png"> Sorting data and Chaining Methods

### Sorting data

You can sort your data by the values in one or more columns, whether categorical or numerical.
* Sorting is useful for quickly inspecting a DataFrame (when the data is small enough to view on screen) or for preparing the data to be plotted.
* Use **.sort_values()** to sort your data. The documentation is [here](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.sort_values.html)
  * The arguments typically used are `by`, `ascending` and `inplace`.
* Consider the dataset below. It holds records of diamond prices and characteristics such as colour, cut, clarity and depth.

For this exercise we will be sorting the data based on a few columns

import seaborn as sns
df = sns.load_dataset('diamonds')
df = df.head(100)
df.head()

You are interested in sorting by colour. First, let's see the unique values for this column using `.unique()`

df['color'].unique()

You can sort the DataFrame by `color` using `.sort_values()`.
* The `ascending` argument is `True` by default.
* Notice that the rows are arranged from D to J.

df.sort_values(by=['color'])

At the same time, you may be interested in sorting in descending order, from J to D. For that, set `ascending=False`

df.sort_values(by=['color'],ascending=False)
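
The sort order depends on the column's data type. As a minimal sketch (the `toy` DataFrame, its column names and its values are hypothetical and not part of the lesson's dataset), a plain text column sorts alphabetically, while a column with an ordered categorical dtype sorts in the order of its categories:

```python
import pandas as pd

# Hypothetical values, used only to illustrate the two behaviours
sizes = ["small", "large", "medium", "small"]
toy = pd.DataFrame({"size_text": sizes})

# Same values stored as an ordered categorical: small < medium < large
toy["size_cat"] = pd.Categorical(
    sizes, categories=["small", "medium", "large"], ordered=True
)

# Text column: alphabetical order
alphabetical = toy.sort_values(by="size_text")["size_text"].tolist()

# Categorical column: follows the declared category order
by_category = toy.sort_values(by="size_cat")["size_cat"].tolist()

print(alphabetical)
print(by_category)
```

If a column does not sort the way you expect, checking its dtype with `df.dtypes` is a reasonable first step.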

You may want to sort by more than one column.
* Let's sort by `color` and then by `price`, in that order. `ascending` keeps its default value (`True`).

df.sort_values(by=['color','price'])
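
When sorting by several columns, `ascending` can also take a list with one boolean per column in `by`. The sketch below uses a small hypothetical DataFrame (`toy`, with made-up values) so that it runs without downloading a dataset:

```python
import pandas as pd

# Hypothetical toy data, loosely mimicking two of the diamonds columns
toy = pd.DataFrame({
    "color": ["E", "D", "E", "D"],
    "price": [326, 357, 402, 337],
})

# color ascending (D before E), price descending within each color
toy_sorted = toy.sort_values(by=["color", "price"], ascending=[True, False])
print(toy_sorted)
```

The list passed to `ascending` must have the same length as the list passed to `by`.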

You probably noticed that none of the calls above changed `df`: we neither used the `inplace` argument nor re-assigned the result.
* `.sort_values()` accepts an `inplace` argument. The example below sets `inplace=True`, which overwrites the DataFrame.

df.sort_values(by=['color','price'], inplace=True)
df
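
The two ways of keeping a sorted result can be compared side by side. This is a minimal sketch on a hypothetical `toy` DataFrame; the variable names are illustrative only:

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({"color": ["J", "D", "G"], "price": [3, 1, 2]})

# Option 1: re-assign the sorted copy to a name; `toy` is left untouched
toy_reassigned = toy.sort_values(by="color")

# Option 2: sort in place; `toy` itself changes and the call returns None
result = toy.sort_values(by="color", inplace=True)

print(result)
print(toy)
```

Because the in-place call returns `None`, writing `toy = toy.sort_values(by="color", inplace=True)` would replace the DataFrame with `None`. Use one approach or the other, not both.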

<img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%205%20-%20Practice.png"> **PRACTICE**: We will consider the tips dataset for practice. It holds records for waiter tips based on the day of the week, time of day, total bill, gender, if it is a table of smokers or not, and how many people were at the table.


df_practice = sns.load_dataset('tips')
print(f"DataFrame shape: {df_practice.shape}")
df_practice.head(10)

For this practice, sort the DataFrame by `size` and `sex` with `ascending=False`, and don't overwrite the DataFrame. Or, if you like, feel free to choose your own settings.

# Write your code here


---

### <img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%2010-%20Lesson%20Content.png"> Chaining methods

It is common to apply several methods to a DataFrame in succession. Writing them as a single chain makes the processing steps more concise and easier to read. To do that, wrap the code in **parentheses** and put each method on its own line.
* This is called **chaining methods**.
* In the example below, we perform a set of chained operations and store the result in a DataFrame called `df_processed`.

df_processed = (df
                .query("cut in ['Ideal','Premium']")
                .rename(mapper={'price':'FinalPrice'},axis=1)
                .sort_values(by=['cut','FinalPrice'],ascending=False)
                .filter(['cut','color','FinalPrice'])
                .replace(to_replace={"Premium":"Most Amazing Ever"})
                )
df_processed

Note that the **previous code is a single statement**, even though it spans several lines, one for each method: `.query()`, `.rename()`, `.sort_values()`, `.filter()` and `.replace()`.
* We will run the same command as above, this time written on a single line. The code has the same result, but the readability is very poor.

df_processed = df.query("cut in ['Ideal','Premium']").rename(mapper={'price':'FinalPrice'},axis=1).sort_values(by=['cut','FinalPrice'],ascending=False).filter(['cut','color','FinalPrice']).replace(to_replace={"Premium":"Most Amazing Ever"})
df_processed
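
Another way to see what a chain does is to compare it with the same steps written one at a time, each stored in its own intermediate variable. The sketch below assumes a small hypothetical DataFrame (`toy`) and illustrative variable names:

```python
import pandas as pd

# Hypothetical toy data with a few diamonds-like columns
toy = pd.DataFrame({
    "cut": ["Ideal", "Good", "Premium", "Ideal"],
    "color": ["E", "E", "I", "J"],
    "price": [326, 327, 334, 340],
})

# Step by step, with one intermediate variable per operation
step1 = toy.query("cut in ['Ideal', 'Premium']")
step2 = step1.rename(columns={"price": "FinalPrice"})
step3 = step2.sort_values(by="FinalPrice", ascending=False)
step_by_step = step3.filter(["cut", "FinalPrice"])

# The same operations as a single chained expression
chained = (toy
           .query("cut in ['Ideal', 'Premium']")
           .rename(columns={"price": "FinalPrice"})
           .sort_values(by="FinalPrice", ascending=False)
           .filter(["cut", "FinalPrice"])
           )

# Both approaches give the same DataFrame
print(chained.equals(step_by_step))
```

The chained version avoids the intermediate names, and each line can be commented out on its own while you are exploring the data.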

<img width="3%" height="3%" align="top"  src="https://codeinstitute.s3.amazonaws.com/predictive_analytics/jupyter_notebook_icons/Icon%205%20-%20Practice.png"> **PRACTICE**: We will consider the tips dataset for practice. It holds records for waiter tips based on the day of the week, time of day, total bill, customers' gender, if it is a table of smokers or not, and how many people were at the table.


df_practice = sns.load_dataset('tips')
print(f"DataFrame shape: {df_practice.shape}")
df_practice.head(10)

For this practice, feel free to come up with your own scenario or use the following suggestion.

Using **chaining**, do a query, a filter and a sort on the DataFrame `df_practice`:
* query rows where `sex` is equal to Female and `size` is equal to 4
* filter the columns `total_bill` and `day`
* sort values by `total_bill` and set `ascending` to False



# Write your code here.



---