
In [None]:
# Import the Pandas library
import pandas as pd

# Create a sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 22, 35, 28],
    'Gender': ['Female', 'Male', 'Male', 'Male', 'Female'],
    'Salary': [50000, 60000, 45000, 70000, 55000]
}

df = pd.DataFrame(data)




1. `df.info()`: Prints a concise summary of the DataFrame: the column data types, non-null counts, and memory usage.

   Example:
   ```python
   df.info()
   ```

2. `df.shape`: Returns a tuple representing the dimensions of the DataFrame, i.e., the number of rows and columns.

   Example:
   ```python
   df.shape  # (5, 4) - 5 rows and 4 columns
   ```

3. `df.loc`: Allows you to access a group of rows and columns by labels.

   Example:
   ```python
   df.loc[0]  # Access the row with index label 0 (the first row under the default index)
   df.loc[:, 'Name']  # Access the 'Name' column for all rows
   df.loc[0, 'Age']  # Access the 'Age' value in the first row
   ```

4. `df.iloc`: Similar to `df.loc`, but selects by 0-based integer position instead of by label.

   Example:
   ```python
   df.iloc[0]  # Access the first row
   df.iloc[:, 1]  # Access the second column for all rows
   ```

5. `df.columns`: Returns an Index object containing the column labels of the DataFrame.

   Example:
   ```python
   df.columns
   ```

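6. `df.set_index()`: Changes the DataFrame's index to the specified column. With `inplace=True` the DataFrame is modified directly, and the column (`'Name'` here) is no longer a regular column afterwards.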

   Example:
   ```python
   df.set_index('Name', inplace=True)
   ```

7. `df.sort_index()`: Sorts the DataFrame's index.

   Example:
   ```python
   df.sort_index(ascending=False)
   ```

8. `df[~condition]`: Filters the DataFrame rows based on the inverse of the condition.

   Example:
   ```python
   df[~(df['Age'] > 25)]  # Rows where Age is not greater than 25
   ```

9. `df['col'].isin()`: Checks if values in a column are present in the provided list or Series.

   Example:
   ```python
   df[df['Name'].isin(['Alice', 'Bob'])]
   ```

10. `df['col'].str.contains('hello', na=False)`: Checks if values in a column contain a specific substring; `na=False` treats missing values as non-matches.

    Example:
    ```python
    df[df['Name'].str.contains('A', na=False)]
    ```

11. Rename all columns: Assign a new set of labels to `df.columns` to rename every column at once. Here the `.str` accessor upper-cases all column names.

    Example:
    ```python
    df.columns = df.columns.str.upper()
    ```

12. Rename two columns with `inplace`: You can use the `.rename()` method with a dictionary to rename specific columns in-place.

    Example:
    ```python
    df.rename(columns={'Name': 'Full Name', 'Age': 'Years Old'}, inplace=True)
    ```

13. Change a single value with `.loc`: Pass a row label and a column name to `.loc` to overwrite one cell.

    Example:
    ```python
    df.loc[0, 'Age'] = 26
    ```

14. Change several columns of one row with `.loc`: Pass a list of column names to `.loc` to update several cells of a row at once.

    Example:
    ```python
    df.loc[0, ['Age', 'Gender']] = [26, 'Female']
    ```

15. `df['col'].apply()`: Applies a function to each element of a column.

    Example:
    ```python
    df['Salary'] = df['Salary'].apply(lambda x: x * 1.1)  # Increase salary by 10%
    ```

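16. `.apply` with a lambda function along an axis: On a DataFrame, `.apply` passes each column to the function when `axis=0` (alias `'index'` or `'rows'`, the default) and each row when `axis=1` (alias `'columns'`).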

    Example:
    ```python
    df[['Age','Salary']].apply(lambda x: x.max() - x.min(), axis='rows')   # Calculate the range for each column
    ```

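17. `.applymap()`: Applies a function element-wise to the entire DataFrame. Since pandas 2.1 this method is deprecated in favor of `DataFrame.map()`, which takes the same kind of function.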

    Example:
    ```python
    df.applymap(lambda x: x.upper() if isinstance(x, str) else x)
    ```

18. `.map()`: Maps each element of a Series through a function or a dictionary. Values that have no key in the dictionary become `NaN`.

    Example:
    ```python
    df['Gender'] = df['Gender'].map({'Male': 0, 'Female': 1})
    ```

19. `.replace()`: Replaces specific values in a column with other values.

    Example:
    ```python
    # Assign the result back: inplace=True on a selected column may not update df itself
    df['Gender'] = df['Gender'].replace({'Male': 'M', 'Female': 'F'})
    ```

20. `str.split(' ', expand=True)`: Splits a string column into multiple columns based on a delimiter (a space here).

    Example:
    ```python
    df['Name'].str.split(' ', expand=True)
    ```

21. `df.drop()`: Removes columns from the DataFrame.

    Example:
    ```python
    df.drop(columns=['Salary'], inplace=True)
    ```

22. `pd.concat([df, df2], ignore_index=True)`: Appends the rows of another DataFrame to the current one and renumbers the index from 0. The older `df.append()` method was removed in pandas 2.0, so `pd.concat` is used instead.

    Example:
    ```python
    df2 = pd.DataFrame({'Name': ['Frank'], 'Age': [40], 'Gender': ['Male']})
    df = pd.concat([df, df2], ignore_index=True)
    ```

23. `df.drop([labels])`: Removes rows from the DataFrame based on their index labels.

    Example:
    ```python
    df.drop([0, 2], inplace=True)  # Removes the rows labelled 0 and 2 (first and third under the default index)
    ```

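24. `df.sort_values(by='col', inplace=True)`: Sorts the DataFrame by a specific column in-place. Calling `sort_values(inplace=True)` on a selected column such as `df['col']` does not reorder the DataFrame.

    Example:
    ```python
    df.sort_values(by='Age', inplace=True)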
    ```

25. `df.sort_index()`: Sorts the DataFrame by its index.

    Example:
    ```python
    df.sort_index(ascending=False, inplace=True)
    ```

26. `nlargest()` and `nsmallest()`: Return the n rows with the largest or smallest values in a specific column.

    Example:
    ```python
    df.nlargest(3, 'Salary')  # Returns the 3 rows with the highest salaries
    df.nsmallest(2, 'Age')   # Returns the 2 rows with the lowest ages
    ```
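
The difference between label-based and position-based access (items 3, 4, 6 and 7) is easier to see once the index is no longer the default 0, 1, 2, and so on. The following is a minimal sketch, assuming the sample DataFrame from the top of the notebook is rebuilt from scratch; the variable names (`by_name`, `sorted_desc`, and the rest) are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 22, 35, 28],
    'Gender': ['Female', 'Male', 'Male', 'Male', 'Female'],
    'Salary': [50000, 60000, 45000, 70000, 55000],
})

# Without inplace=True, set_index returns a new DataFrame and leaves df untouched.
by_name = df.set_index('Name')

# .loc selects by label, .iloc by integer position.
age_by_label = by_name.loc['Charlie', 'Age']   # row labelled 'Charlie'
age_by_position = by_name.iloc[2, 0]           # third row, first column ('Age')

# Sorting the index moves rows to new positions, but labels still find the same row.
sorted_desc = by_name.sort_index(ascending=False)
first_label_after_sort = sorted_desc.index[0]
age_after_sort = sorted_desc.loc['Charlie', 'Age']
first_age_after_sort = sorted_desc.iloc[0, 0]
```

Keeping the original `df` unchanged (no `inplace=True`) means later examples that still expect `'Name'` as a regular column keep working.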


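`.map()` and `.replace()` (items 18 and 19) look interchangeable when used with a dictionary, but they treat values that are missing from the dictionary differently. The sketch below illustrates this on a small hypothetical Series; the `'Other'` value is not part of the sample data and is added only to show the difference.

```python
import pandas as pd

# Hypothetical column with one value that the dictionary does not cover.
gender = pd.Series(['Female', 'Male', 'Other'])
codes = {'Male': 'M', 'Female': 'F'}

mapped = gender.map(codes)        # 'Other' has no key, so it becomes NaN
replaced = gender.replace(codes)  # 'Other' is left unchanged

# Neither call modifies the original Series; assign the result to keep it.
```

As a rule of thumb, `.map()` fits a complete recoding of a column, while `.replace()` fits changing only a few specific values.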

# Groupby aggregate

1. **Filter with One Criterion**:
   You can filter a DataFrame based on a single condition using boolean indexing.

   Example:
   ```python
   single_filter = df[df['Age'] > 25]
   ```

2. **Filter with Two Criteria**:
   To filter with two conditions, you can use the `&` (and) or `|` (or) operators to combine the conditions.

   Example:
   ```python
   double_filter = df[(df['Age'] > 25) & (df['Gender'] == 'Male')]
   ```

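3. **Filter with a SQL-like Query**:
   You can use the `.query()` method to filter a DataFrame with an expression string that reads much like a SQL `WHERE` clause. Column names are used directly inside the string.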

   Example:
   ```python
   sql_filtered = df.query("Age > 25 and Gender == 'Male'")
   ```

4. **Rename Columns**:
   You can use the `.rename()` method to rename columns.

   Example:
   ```python
   df.rename(columns={'Name': 'Full Name', 'Age': 'Years Old'}, inplace=True)
   ```

5. **Group By and Count the Number of Observations**:
   Use the `.groupby()` and `.size()` methods to group by a column and count the number of observations in each group.

   Example:
   ```python
   observations_count = df.groupby('Gender').size()
   ```

6. **Group By and Calculate One Variable**:
   Group by a column and calculate statistics for another column within each group using methods like `.groupby()` and `.mean()`.

   Example:
   ```python
   avg_salary_by_gender = df.groupby('Gender')['Salary'].mean()
   ```

7. **Group By and Calculate Multiple Variables**:
   Use `.groupby()` and `.agg()` to calculate multiple statistics for multiple columns within groups.

   Example:
   ```python
   group_stats = df.groupby('Gender').agg({'Age': 'mean', 'Salary': 'sum'})
   ```

8. **Group By Based on Multiple Variables and Calculate Multiple Measures**:
   You can group by multiple columns and calculate multiple measures for each group using `.groupby()` and `.agg()`.

   Example:
   ```python
   multi_group_stats = df.groupby(['Gender', 'Age']).agg({'Salary': 'mean', 'Name': 'count'})
   ```

9. **Show `value_counts()` and Convert the Result to DataFrame**:
    Use `value_counts()` to count the occurrences of unique values in a column and convert the result to a DataFrame.

    Example:
    ```python
    gender_counts = df['Gender'].value_counts().reset_index()
    gender_counts.columns = ['Gender', 'Count']
    ```

10. **Sort Values with One and Multiple Variables**:
    You can use the `.sort_values()` method to sort the DataFrame by one or more columns.

    Example (Single Variable):
    ```python
    sorted_by_salary = df.sort_values(by='Salary', ascending=False)
    ```

    Example (Multiple Variables):
    ```python
    sorted_by_gender_age = df.sort_values(by=['Gender', 'Age'], ascending=[True, False])
    ```
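
The filtering, grouping, and sorting steps above can be chained into one pipeline. The following is a minimal sketch, assuming the sample DataFrame from the top of the notebook with its original column names. It uses pandas named aggregation (`new_name=('column', 'function')`) as an alternative to the dictionary form of `.agg()`; the output column names `avg_age`, `total_salary`, and `headcount` are illustrative choices.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 22, 35, 28],
    'Gender': ['Female', 'Male', 'Male', 'Male', 'Female'],
    'Salary': [50000, 60000, 45000, 70000, 55000],
})

summary = (
    df.query("Salary >= 50000")            # keep rows with a salary of at least 50000
      .groupby('Gender')                   # one group per gender
      .agg(
          avg_age=('Age', 'mean'),         # mean age per group
          total_salary=('Salary', 'sum'),  # summed salary per group
          headcount=('Name', 'count'),     # number of non-null names per group
      )
      .reset_index()                       # turn the 'Gender' index back into a column
      .sort_values(by='total_salary', ascending=False, ignore_index=True)
)
```

Named aggregation gives each result column an explicit name, which avoids renaming columns after the aggregation.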

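The three filtering styles from items 1 to 3 can express the same condition. The sketch below, again assuming the sample DataFrame with its original column names, checks that a boolean mask and `.query()` select the same rows, and that `~` selects exactly the remaining ones. The parentheses around each comparison matter because `&` and `|` bind more tightly than `>` or `==` in Python.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 22, 35, 28],
    'Gender': ['Female', 'Male', 'Male', 'Male', 'Female'],
    'Salary': [50000, 60000, 45000, 70000, 55000],
})

# Build the condition once so it can be reused and inverted.
is_older_male = (df['Age'] > 25) & (df['Gender'] == 'Male')

mask_result = df[is_older_male]                            # boolean indexing
query_result = df.query("Age > 25 and Gender == 'Male'")   # same filter as a string
inverse_result = df[~is_older_male]                        # every row that fails the condition
```

Storing the mask in a variable keeps long conditions readable and makes the inverse filter a one-character change.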
