### Data Transformation Using Pandas

First, we import the pandas library, which is essential for working with DataFrames.

In [48]:
import pandas as pd

We load a dataset from a CSV file into a pandas DataFrame. Loading the data is usually the first step in an analysis.

In [49]:
df = pd.read_csv('new_csv.csv')

The `.head()` method displays the first rows of the DataFrame (five by default). This is a quick way to inspect the data and check its columns and types of values.

In [50]:
df.head()

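Next, we sort the DataFrame by the 'city' column in descending order. `sort_values` returns a new DataFrame, and because the result is not assigned here, `df` itself is unchanged; the cell only displays the sorted view.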

In [51]:
df.sort_values('city', ascending=False)
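
To illustrate that `sort_values` leaves the original untouched, here is a minimal sketch on a small made-up frame. The `toy` data and its column values are assumptions for illustration and are not taken from `new_csv.csv`.

```python
import pandas as pd

# Hypothetical toy data standing in for the real CSV
toy = pd.DataFrame({"city": ["Austin", "Denver", "Boston"], "amount": [10, 20, 30]})

# sort_values returns a sorted copy; `toy` keeps its original row order
by_city_desc = toy.sort_values("city", ascending=False)

print(by_city_desc["city"].tolist())  # ['Denver', 'Boston', 'Austin']
print(toy["city"].tolist())           # ['Austin', 'Denver', 'Boston']
```

To keep the sorted order, assign the result to a name, as the next cell does.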

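Here, we sort the DataFrame by multiple columns: rows are ordered by 'city', and ties within a city are broken by 'payment_method'. The result is stored in `df2`. `sort_values` already returns a new DataFrame, so the `.copy()` is redundant but harmless.

In [52]:
df2 = df.sort_values(['city', 'payment_method']).copy()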
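
As a small sketch of how a multi-column sort breaks ties, the example below uses made-up rows (the `toy` frame is hypothetical). It also shows that `ascending` accepts a list, one direction per sort column.

```python
import pandas as pd

# Hypothetical toy data; only the column names mirror the notebook
toy = pd.DataFrame({
    "city": ["Boston", "Austin", "Boston", "Austin"],
    "payment_method": ["cash", "card", "card", "cash"],
})

# Rows are ordered by 'city'; ties are broken by 'payment_method'
both_asc = toy.sort_values(["city", "payment_method"])
print(both_asc.index.tolist())  # [1, 3, 2, 0]

# A list passed to `ascending` sets the direction for each column in turn
mixed = toy.sort_values(["city", "payment_method"], ascending=[True, False])
print(mixed.index.tolist())  # [3, 1, 0, 2]
```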

After sorting, each row keeps its original index label, so the index is no longer in order. `reset_index` replaces it with a clean 0-based index. `drop=True` prevents the old index from being added as a column, and `inplace=True` modifies `df2` directly instead of returning a new DataFrame.

In [53]:
df2.reset_index(drop=True, inplace=True)
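
The effect on the index can be seen in a minimal sketch with made-up data (the `toy` frame is an assumption). It uses assignment rather than `inplace=True`, which is an equivalent alternative here.

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({"city": ["Denver", "Austin", "Boston"]})

sorted_toy = toy.sort_values("city")
print(sorted_toy.index.tolist())  # [1, 2, 0]  (original labels, now out of order)

# drop=True discards the old labels instead of keeping them as a column
clean = sorted_toy.reset_index(drop=True)
print(clean.index.tolist())    # [0, 1, 2]
print(clean.columns.tolist())  # ['city']
```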

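The `rank` method assigns a rank to each row based on its `customer_id`. With `method='dense'`, tied values share a rank and the ranks stay consecutive, with no gaps. The ranks are computed from `df2['customer_id']`, not `df['customer_id']`: column assignment aligns on index labels, and `df2`'s index was reset after sorting, so ranks taken from `df` would land on the wrong rows.

In [55]:
df2['Rank'] = df2['customer_id'].rank(ascending=True, method='dense')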
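
To make the "no gaps" behaviour concrete, this sketch compares `method='dense'` with `method='min'` on a few made-up IDs (the values are hypothetical).

```python
import pandas as pd

# Hypothetical customer IDs with one repeated value
ids = pd.Series([101, 103, 101, 107])

# dense: ties share a rank and the next rank follows immediately
dense = ids.rank(method="dense")
print(dense.tolist())  # [1.0, 2.0, 1.0, 3.0]

# min: ties share the lowest rank, and the following rank is skipped
min_rank = ids.rank(method="min")
print(min_rank.tolist())  # [1.0, 3.0, 1.0, 4.0]
```

Note that `rank` returns floats; cast with `.astype(int)` if integer ranks are preferred and there are no missing values.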

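The 'Unnamed: 0' column appears to be an artifact from the CSV file; such a column typically appears when a DataFrame was saved with its index and the file is read back without `index_col`. We drop it from `df` using the `drop` method, as it is not needed for our analysis.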

In [84]:
df.drop(columns=['Unnamed: 0'], inplace=True)
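
The sketch below reproduces how such a column can arise and shows two ways to deal with it. It assumes the column came from a saved index, which is a common cause but is not confirmed for `new_csv.csv`; the `toy` frame is made up, and an in-memory buffer stands in for a file.

```python
import io
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({"city": ["Austin", "Boston"]})

# to_csv writes the index by default, as a first column with an empty header
csv_text = toy.to_csv()

reloaded = pd.read_csv(io.StringIO(csv_text))
print(reloaded.columns.tolist())  # ['Unnamed: 0', 'city']

# Option 1: drop the column after loading
cleaned = reloaded.drop(columns=["Unnamed: 0"])

# Option 2: tell read_csv that the first column is the index
as_index = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(as_index.columns.tolist())  # ['city']
```

Writing the file with `to_csv(..., index=False)` avoids the extra column in the first place.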

The next two cells reorder the columns to bring 'city' to the front, which makes the DataFrame easier to scan. The list comprehension skips 'city' when collecting the remaining columns, so the column is not duplicated.

In [57]:
cols = ['city'] + [col for col in df.columns if col != 'city']

In [58]:
df = df[cols]
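
A minimal sketch of the same reordering on made-up columns (the `toy` frame is hypothetical), including what the column list would look like without the filter:

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({"customer_id": [1, 2], "city": ["Austin", "Boston"], "amount": [10, 20]})

# Without the filter, 'city' would be listed twice
naive = ["city"] + list(toy.columns)
print(naive)  # ['city', 'customer_id', 'city', 'amount']

# With the filter, 'city' moves to the front and the other columns keep their order
cols = ["city"] + [col for col in toy.columns if col != "city"]
reordered = toy[cols]
print(reordered.columns.tolist())  # ['city', 'customer_id', 'amount']
```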

We now load a different dataset that contains missing values to demonstrate how to handle them.

In [61]:
sm = pd.read_csv('missing_values_data.csv')

We sort the new DataFrame by 'Name' to make the rows easier to scan. As before, the result is only displayed; `sm` keeps its original order because the sorted DataFrame is not assigned.

In [66]:
sm.sort_values('Name')

`reset_index()` is called again, this time without `drop=True`, so the old index is kept as a new column named 'index'. The call returns a new DataFrame and the result is not assigned, so `sm` is not modified.

In [67]:
sm.reset_index()
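
To show where the 'index' column comes from, here is a small sketch with made-up names (the `toy` frame is an assumption, not the contents of `missing_values_data.csv`).

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({"Name": ["Cara", "Abe", "Bo"]})

# Assign the sorted result so the new row order is kept
sorted_toy = toy.sort_values("Name")

# Without drop=True, the old labels are preserved in a column called 'index'
kept = sorted_toy.reset_index()
print(kept.columns.tolist())   # ['index', 'Name']
print(kept["index"].tolist())  # [1, 2, 0]
```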

A new column 'rank' is added to `sm`, ranking 'Age' in descending order, so the highest age gets rank 1. With `method='dense'`, tied ages share a rank and no ranks are skipped. Rows with a missing 'Age' receive a missing rank (`NaN`) by default.

In [89]:
sm['rank'] = sm['Age'].rank(ascending=False, method='dense')
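
Since this dataset contains missing values, the following sketch shows how `rank` treats them. The ages are made up, and `na_option='bottom'` is shown only as one possible alternative to the default, not as what the notebook uses.

```python
import pandas as pd

# Hypothetical ages with one missing value
ages = pd.Series([30, float("nan"), 45, 30])

# Default na_option='keep': the missing age gets a missing rank
ranks = ages.rank(ascending=False, method="dense")
print(ranks.tolist())  # [2.0, nan, 1.0, 2.0]

# na_option='bottom' places missing values after all ranked values
ranks_bottom = ages.rank(ascending=False, method="dense", na_option="bottom")
print(ranks_bottom.tolist())  # [2.0, 3.0, 1.0, 2.0]
```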