### Pandas
    You and your team are analysts at a mobile phone company. Your manager asks you to analyze the market trends of different mobile phone models to understand their performance based on various features.

#### Reading CSV Files
    What's the first thing we should do when we receive a dataset?
    The first step is to read the CSV file and load it into a DataFrame to inspect the data. Let's read the file and take a look at the first few rows.

In [107]:
import pandas as pd

# Reading the CSV file
file_path = 'mobile_data_analysis.csv'
data = pd.read_csv(file_path)

# Display all column names of the dataset
# (a notebook cell only shows its last expression, so data.head() gets its own cell below)
data.columns

Index(['Name', 'Rating', 'Spec_score', 'No_of_sim', 'Ram', 'Battery',
       'Display', 'Camera', 'External_Memory', 'Android_version', 'Price',
       'company', 'Inbuilt_memory', 'fast_charging', 'Screen_resolution',
       'Processor', 'Processor_name'],
      dtype='object')

In [None]:
data.head()

    Our dataset contains various columns like 'Name', 'Rating', 'Spec_score', 'No_of_sim', 'Ram', 'Battery', 'Display', 'Camera', 'External_Memory', 'Android_version', 'Price', 'company', 'Inbuilt_memory', 'fast_charging', 'Screen_resolution', 'Processor', and 'Processor_name'. Each row represents a different mobile phone.

#### Loading Python Data Objects
    You know, there's a common misconception that data always comes in CSV files. But in reality, data can be stored in various formats, and that's why we have different functions to read them. It's not just about CSV: we can handle Excel files, SQL databases, JSON files, and more. The flexibility to read data from multiple sources is really important in data analysis.

    A dataset can be loaded from various data sources using relevant Pandas constructs (functions) as mentioned below:
    CSV file - read_csv() function
    JSON file - read_json() function
    Excel file - read_excel() function
    Database table - read_sql() function
    By default, all the above functions return a DataFrame object, and most of them (read_excel() is an exception) accept a parameter called 'chunksize' for reading the data in smaller pieces.
    e.g. to load our CSV data file, we use read_csv() exactly as in the code above:
    data = pd.read_csv(file_path)
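
    To see what 'chunksize' changes, here is a minimal sketch. It assumes a tiny in-memory CSV (the hypothetical csv_text below) in place of a large file on disk. When 'chunksize' is set, read_csv() hands back an iterator of smaller DataFrames instead of one DataFrame, so we can process the data piece by piece.

```python
import io
import pandas as pd

# Hypothetical in-memory CSV standing in for a large file on disk
csv_text = "Name,Price\nA,100\nB,200\nC,300\nD,400\nE,500\n"

# With chunksize set, read_csv returns an iterator of DataFrames
chunks = pd.read_csv(io.StringIO(csv_text), chunksize=2)

total_price = 0
row_count = 0
for chunk in chunks:
    # Each chunk is an ordinary DataFrame with at most 2 rows
    total_price += chunk["Price"].sum()
    row_count += len(chunk)

print(row_count, total_price)
```

    Only one chunk is held in memory at a time, which is the main reason to reach for this parameter on very large files.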

In [None]:
# Reading with specific columns
data_selected_columns = pd.read_csv(file_path, usecols=['Battery', 'Ram', 'Price'])
print("\nReading Specific Columns:\n", data_selected_columns.head())

# Reading with index column
data_with_index = pd.read_csv(file_path, index_col='Name')
print("\nReading with Index Column:\n", data_with_index.head())

# Reading with missing values handling
data_missing_values = pd.read_csv(file_path, na_values=['NA', 'n/a', ''])
print("\nReading with Missing Values Handling:\n", data_missing_values.head())

#### Understanding the Data
    You look at the data and notice columns like 'Name', 'Rating', 'Spec_score', 'Ram', 'Battery', and many others.
    Question: How can we understand the structure and types of data we're dealing with?
    We can use methods like info() and describe() to get a better understanding of the data.

In [None]:
# Displaying information about the DataFrame
# (info() prints its report itself and returns None, so we don't wrap it in print())
data.info()

# Descriptive statistics of numerical columns
print(data.describe())
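
    By default describe() summarizes only the numerical columns. As a small sketch on a made-up DataFrame (the name toy and its values are hypothetical, not taken from our dataset), passing include="all" also summarizes text columns, and dtypes shows the type of each column.

```python
import pandas as pd

# Hypothetical toy data: one text column and one numerical column
toy = pd.DataFrame({
    "company": ["A", "B", "A"],
    "Price": [100.0, 250.0, 175.0],
})

# Data type of each column
print(toy.dtypes)

# include="all" adds count/unique/top/freq for non-numerical columns
summary = toy.describe(include="all")
print(summary)
```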


#### Performing Arithmetic Operations - Crunching Numbers
    Our next adventure involves performing arithmetic operations. Suppose we want to understand how doubling the price impacts our dataset. Let's create a new column for this.

In [None]:
# Creating a new column with doubled Price
data['Double_Price'] = data['Price'] * 2

# Display the first few rows to see the new column
print(data[['Price', 'Double_Price']].head())


    Here, we multiply the 'Price' column by 2 and store it in a new column 'Double_Price'. This simple operation opens the door to more complex analyses.

    Can we do other arithmetic operations as well?

    Absolutely! We can add, subtract, multiply, or divide columns. Let's create a column showing the sum of 'Ram' and 'Battery'.

In [None]:
# Creating a new column with the sum of Ram and Battery
data['Ram_Battery_Sum'] = data['Ram'] + data['Battery']

# Display the first few rows to see the new column
print(data[['Ram', 'Battery', 'Ram_Battery_Sum']].head())
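
    Arithmetic like this only works on numerical columns. The sketch below assumes a hypothetical case where 'Ram' was loaded as text such as "4 GB RAM" (we have not checked whether our file looks like this). It extracts the number first and then divides one column by another.

```python
import pandas as pd

# Hypothetical toy data where Ram is stored as text
toy = pd.DataFrame({
    "Ram": ["4 GB RAM", "8 GB RAM"],
    "Price": [10000, 24000],
})

# Pull the first run of digits out of the text and convert it to a number
toy["Ram_GB"] = pd.to_numeric(toy["Ram"].str.extract(r"(\d+)")[0])

# Division works element by element, just like addition and multiplication
toy["Price_per_GB"] = toy["Price"] / toy["Ram_GB"]
print(toy)
```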


#### Focusing the Lens - Selecting Data
    To find valuable insights, we often need to focus on specific parts of our data. For example, let's select only the 'Name', 'Price', and 'Ram' columns.

In [None]:
# Selecting specific columns
selected_columns = data[['Name', 'Price', 'Ram']]

# Display the selected columns
print(selected_columns.head())


    This step helps us concentrate on the most relevant data for our analysis.

    This is cool! How about selecting rows based on multiple conditions?

    Next, we'll narrow down our rows to high-priced, highly rated phones. Let's select phones with 'Price' greater than 20000 and 'Rating' greater than 4.
      

In [None]:
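# Selecting rows based on multiple conditions (Price > 20000 AND Rating > 4)
# Each condition needs its own parentheses because & binds tighter than > in Python
high_price_high_rating_phones = data[(data['Price'] > 20000) & (data['Rating'] > 4.0)]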

# Display the selected rows
high_price_high_rating_phones.head()
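
    Conditions can also be combined with | (OR) and negated with ~ (NOT), and query() offers a string-based alternative. The following is a small sketch on made-up data (the DataFrame toy and its values are hypothetical).

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({
    "Name": ["P1", "P2", "P3", "P4"],
    "Price": [15000, 25000, 30000, 8000],
    "Rating": [4.5, 3.9, 4.2, 4.6],
})

# OR: cheap phones or very highly rated ones
either = toy[(toy["Price"] < 10000) | (toy["Rating"] > 4.4)]

# NOT: everything except phones priced above 20000
not_expensive = toy[~(toy["Price"] > 20000)]

# query() version of "Price > 20000 AND Rating > 4"
both = toy.query("Price > 20000 and Rating > 4")

print(either["Name"].tolist())
print(not_expensive["Name"].tolist())
print(both["Name"].tolist())
```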


#### Rounding Numbers - Making Data Neat
    Clean data is crucial for clear insights. Sometimes, this means rounding numbers to make them more readable. We can round numbers using the round() method. Let's round the 'Price' column to the nearest integer.

In [None]:
# Rounding the Price column
data['Rounded_Price'] = data['Price'].round()

# Display the first few rows to see the rounded values
print(data[['Price', 'Rounded_Price']].head())


    This rounds the 'Price' column to the nearest integer and stores it in a new column 'Rounded_Price', which makes our data cleaner and easier to interpret.

    We can also round to a specific number of decimal places. For instance, let's round the 'Rating' column to one decimal place.

In [None]:
# Rounding the Rating column to one decimal place
data['Rounded_Rating'] = data['Rating'].round(1)

# Display the first few rows to see the rounded values
print(data[['Rating', 'Rounded_Rating']].head())
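
    One detail worth knowing: pandas follows NumPy and rounds exact halves to the nearest even number, which can differ from the "always round half up" rule taught in school. A tiny sketch with made-up values:

```python
import pandas as pd

# Values that sit exactly halfway between two integers
halves = pd.Series([0.5, 1.5, 2.5, 3.5])

# Halves go to the nearest even integer: 0, 2, 2, 4
rounded = halves.round()
print(rounded.tolist())
```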


#### Data Aggregation - Summarizing Data
    To gain quick insights, we can summarize our data through aggregation. For instance, let's calculate the average price of phones grouped by their company.

In [None]:
# Aggregating data: Calculating the mean Price by company
mean_price_by_company = data.groupby('company')['Price'].mean()

# Display the aggregated data
print(mean_price_by_company)


    Grouping by 'company' and calculating the mean 'Price' gives us an overview of how prices vary across different brands.
    Exciting! Now let's check which companies' phones have the highest average ratings.

In [None]:
# Aggregating data: calculating the mean Rating by company, highest first
average_ratings = data.groupby('company')['Rating'].mean().sort_values(ascending=False)

# Display the aggregated data
average_ratings
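
    Several summaries can be computed in one pass with agg(). This is a minimal sketch on made-up data; the DataFrame toy and the output column names (mean_price, max_rating, phone_count) are hypothetical choices for illustration.

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({
    "company": ["A", "A", "B", "B", "B"],
    "Price": [100, 300, 200, 400, 600],
    "Rating": [4.0, 4.4, 3.8, 4.2, 4.6],
})

# Named aggregation: new_column=(source_column, function)
summary = toy.groupby("company").agg(
    mean_price=("Price", "mean"),
    max_rating=("Rating", "max"),
    phone_count=("Price", "count"),
)
print(summary)
```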

#### Cleaning Data - Data Munging Techniques
    Data isn't always perfect. It often contains missing values or duplicates that need to be addressed. Let's start by identifying missing values.

In [None]:
# Checking for missing values
missing_values = data.isnull().sum()
print("Missing Values:\n", missing_values)


    We see the count of missing values for each column. Now, let's fill missing values in 'Android_version' with the mean.

In [None]:
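# Filling missing values in Android_version with the column mean
# (assigning the result back avoids the unreliable inplace=True on a selected column)
data['Android_version'] = data['Android_version'].fillna(data['Android_version'].mean())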

# Display the first few rows to see changes
print(data.head())
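
    The mean is only defined for numerical columns. As a hedged sketch on made-up data (the DataFrame toy is hypothetical, and here 'Android_version' is assumed to be stored as text), the median can be used for skewed numbers and the most frequent value (the mode) for text columns.

```python
import pandas as pd

# Hypothetical toy data with one missing value in each column
toy = pd.DataFrame({
    "Battery": [4000.0, None, 5000.0, 6000.0],
    "Android_version": ["13", None, "13", "12"],
})

# Numerical column: fill with the median
toy["Battery"] = toy["Battery"].fillna(toy["Battery"].median())

# Text column: fill with the most frequent value
most_common = toy["Android_version"].mode()[0]
toy["Android_version"] = toy["Android_version"].fillna(most_common)

print(toy)
```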


    We can also remove duplicate rows to clean our data.

In [None]:
# Removing duplicates
data_no_duplicates = data.drop_duplicates()

# Display the first few rows to see changes
print(data_no_duplicates.head())
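
    By default drop_duplicates() only removes rows that match in every column. The sketch below uses made-up data (the DataFrame toy is hypothetical) to show the 'subset' and 'keep' parameters, which treat rows as duplicates when only the chosen columns match.

```python
import pandas as pd

# Hypothetical toy data: P1 appears twice with different prices
toy = pd.DataFrame({
    "Name": ["P1", "P1", "P2"],
    "Price": [100, 120, 200],
})

# No two rows are identical in every column, so nothing is dropped
full = toy.drop_duplicates()

# Compare on Name only and keep the first occurrence of each name
by_name = toy.drop_duplicates(subset="Name", keep="first")

print(len(full), by_name["Price"].tolist())
```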


#### Saving Our Progress - A Wise Move
    As we navigate through our analysis, it's crucial to save our progress. This way, we can pick up right where we left off. Let's save our DataFrame to a CSV file.

In [None]:
# Save DataFrame to a CSV file
data.to_csv('saved_mobile_phone_data.csv', index=False)
print("Data saved to CSV file.")
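
    A pickle file is another option: unlike CSV, it preserves column data types and the index exactly. Here is a minimal round-trip sketch on made-up data, written to a temporary folder; the DataFrame toy and the file name toy_phones.pkl are hypothetical.

```python
import os
import tempfile
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({"Name": ["P1", "P2"], "Price": [100.5, 200.0]})

with tempfile.TemporaryDirectory() as tmp_dir:
    pickle_path = os.path.join(tmp_dir, "toy_phones.pkl")

    # Save the DataFrame, then load it back
    toy.to_pickle(pickle_path)
    restored = pd.read_pickle(pickle_path)

# True when values, dtypes, and index all match
print(restored.equals(toy))
```

    Only load pickle files from sources you trust, since unpickling can execute arbitrary code.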


#### Visualizing Data - Bringing Insights to Life
    Visualization is a powerful tool to understand data trends and patterns. Let's plot some graphs to visualize our findings. First, we need to import the necessary library.

In [None]:
import matplotlib.pyplot as plt

# Plotting the distribution of phone prices
plt.figure(figsize=(10, 6))
plt.hist(data['Price'], bins=30, edgecolor='k', alpha=0.7)
plt.title('Distribution of Phone Prices')
plt.xlabel('Price')
plt.ylabel('Frequency')
plt.show()


    This histogram shows the distribution of phone prices, helping us understand the range and frequency of different price points.

    Next, let's plot the average price by company using a bar chart.

In [None]:
# Plotting average price by company
mean_price_by_company.plot(kind='bar', figsize=(12, 8), color='skyblue')
plt.title('Average Price by Company')
plt.xlabel('Company')
plt.ylabel('Average Price')
plt.show()


#### Conclusion: Insights and Learnings

    Through this journey, we've explored various techniques to read, clean, manipulate, and visualize data using pandas. We've seen how to perform arithmetic operations, select and filter data, handle missing values, aggregate data, and visualize trends.

    Each step has brought us closer to understanding our mobile phone dataset, enabling us to extract meaningful insights and make informed decisions. This story-driven approach not only teaches us the technical skills but also emphasizes the importance of clear, clean, and insightful data analysis.