### **Introduction**

Imagine you have a story or a list of names saved in a `.txt` file (the kind you would create in Notepad). Your Python program can't see that text until you explicitly tell it to open the file and read it. Reading text files is the most basic way to get data into your program.

### **Definition**
  **Text File (`.txt`):** A simple file that contains only plain text without any special formatting (like bold, italics, etc.).
  
  **Reading a File:** The process where Python opens a file, takes its content, and makes it available for your program to use (e.g., store it in a variable).
  
  **`open()` Function:** The key Python function used to open a file. It creates a connection between your program and the file on your computer.


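**The `with` Statement (The Safe Way to Open Files)**

We use `with` so that Python closes the file automatically as soon as the indented block ends, even if an error occurs inside it. This prevents problems such as leaving files locked or running out of open file handles and, when writing, losing data that was never saved to disk.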



**Syntax:**

```python
with open('filename.txt', 'r') as file:
    content = file.read()
    # Do something with content
# File is automatically closed here
```
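
To see what `with` does on our behalf, the sketch below compares it with closing the file by hand. It first creates a throwaway file in a temporary folder; the folder and the sample text are assumptions made only so the code runs anywhere.

```python
import os
import tempfile

# Create a small throwaway file (path and contents are made up for this sketch).
path = os.path.join(tempfile.mkdtemp(), "filename.txt")
with open(path, "w") as file:
    file.write("sample text")

# Manual version: we must remember to close the file ourselves.
file = open(path, "r")
try:
    manual_content = file.read()
finally:
    file.close()  # runs even if read() raises an error

# `with` version: closing happens automatically when the block ends.
with open(path, "r") as file:
    with_content = file.read()

print(manual_content == with_content)  # True: both approaches read the same text
print(file.closed)                     # True: the file is closed after the block
```

Both versions behave the same, but the `with` version is shorter and leaves no way to forget the `close()` call.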


#### **Examples with Explanations**

**Example 1: Reading the Entire File at Once**

Let's say we have a file called `greeting.txt` with the text: `Hello, E1 Students! Welcome to Python!`


```python
# Step 1: Open the file in read mode ('r')
with open('greeting.txt', 'r') as file:
    # Step 2: Read the entire content into a variable
    text = file.read()

# Step 3: Print the content
print(text)
```


**Explanation:**
1.  `open('greeting.txt', 'r')`: Opens `greeting.txt`. The `'r'` stands for "read mode".
2.  `as file`: The opened file is now accessible via the variable `file`.
3.  `file.read()`: This method reads **everything** in the file and stores it in the variable `text`.
4.  `print(text)`: Outputs the content.


**Output:**
```
Hello, E1 Students! Welcome to Python!
```


**Example 2: Reading Line by Line**

Let's say we have `courses.txt`:
```
AI
Python
Matplotlib
```


```python
with open('courses.txt', 'r') as file:
    for line in file:  # Loops through each line in the file
        print(f"Course: {line.strip()}")  # .strip() removes surrounding whitespace, including the invisible '\n' at the end of the line
```


**Output:**
```
Course: AI
Course: Python
Course: Matplotlib
```
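
To see why `.strip()` matters, it can help to look at the lines exactly as Python receives them. The sketch below uses `io.StringIO` as an in-memory stand-in for `courses.txt` (an assumption made so the code runs without the file on disk); it behaves like an opened file in a `for` loop.

```python
import io

# In-memory stand-in for courses.txt
file = io.StringIO("AI\nPython\nMatplotlib\n")

raw_lines = []
clean_lines = []
for line in file:
    raw_lines.append(line)            # the line exactly as read
    clean_lines.append(line.strip())  # the same line without the newline

print(raw_lines)    # ['AI\n', 'Python\n', 'Matplotlib\n']
print(clean_lines)  # ['AI', 'Python', 'Matplotlib']
```

Because `print()` adds its own newline, printing the raw lines would leave a blank line after every course.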


**Example 3: Read Lines into a List**

Assume `data.txt` contains:
```
Hello Students
Welcome to Python Class
```

```python
with open("data.txt", "r") as f:
    lines = f.readlines()

print(lines)
```

**Output:**

```python
['Hello Students\n', 'Welcome to Python Class\n']
```
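
Each item returned by `readlines()` still carries its `\n`. Two common ways to get a clean list are sketched below; `io.StringIO` stands in for `data.txt` here, which is an assumption made so the code runs without the file.

```python
import io

# In-memory stand-in for data.txt
f = io.StringIO("Hello Students\nWelcome to Python Class\n")

# Option 1: strip the newline from each line after readlines()
lines = f.readlines()
clean = [line.rstrip("\n") for line in lines]
print(clean)  # ['Hello Students', 'Welcome to Python Class']

# Option 2: read everything, then split on line boundaries
f.seek(0)  # go back to the start of the "file"
also_clean = f.read().splitlines()
print(also_clean)  # ['Hello Students', 'Welcome to Python Class']
```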

**Example 4: Counting Words in a Text File**

```python
with open("data.txt", "r") as f:
    text = f.read()

words = text.split()
print("Total words:", len(words))
```

**Output:**

```
Total words: 6
```
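
The count works because `split()` with no arguments breaks the text on any whitespace, including the newlines between lines. A minimal sketch, using a string with the same contents as `data.txt` instead of the file itself:

```python
# Same text as data.txt, written as a string for this sketch
text = "Hello Students\nWelcome to Python Class\n"

words = text.split()  # splits on spaces and newlines alike
print(words)          # ['Hello', 'Students', 'Welcome', 'to', 'Python', 'Class']
print(len(words))     # 6

# Counting words per line instead of for the whole file
for line in text.splitlines():
    print(len(line.split()), "words:", line)
```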


### **Introduction**
What if your data is more complex, like a spreadsheet with rows and columns? A `CSV` (Comma-Separated Values) file is exactly that. It's like a simple Excel sheet. While you *could* read it as a text file, it would be messy. **Pandas** is a superhero library in Python that makes working with tabular data (like CSV files) incredibly easy.


### **Definition**
*   **CSV File:** A file where each line represents a row of data, and the values in each row are separated by commas. CSV stands for **Comma-Separated Values** — a simple text format for storing tabular data.
    *   Example Row: `Alice,25,London`
*   **Pandas:** A powerful and popular Python library for data manipulation and analysis.
*   **DataFrame:** The most important pandas object. It's a 2-dimensional table with rows and columns, just like a spreadsheet in Excel or Google Sheets.
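
To make the "values separated by commas" idea concrete, here is a rough sketch of reading CSV data by hand with plain string methods. The header names `Name,Age,City` and the second row are assumptions added for illustration; only the `Alice,25,London` row comes from the definition above.

```python
import io

# In-memory stand-in for a CSV file (header and Bob's row are made up).
csv_file = io.StringIO("Name,Age,City\nAlice,25,London\nBob,30,Paris\n")

header = csv_file.readline().strip().split(",")
rows = [line.strip().split(",") for line in csv_file]

print(header)   # ['Name', 'Age', 'City']
print(rows[0])  # ['Alice', '25', 'London']

# Every value is still a string, so numbers need manual conversion.
ages = [int(row[1]) for row in rows]
print(sum(ages) / len(ages))  # 27.5
```

This works for tiny files, but every conversion and lookup has to be written by hand, which is the messiness pandas removes.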


#### **First, You Must Import Pandas**
Before you can use pandas, you must import it. By convention it is imported under the short alias `pd`.

```python
import pandas as pd
```


#### **Examples with Explanations**

**Example 1: Reading a CSV File into a DataFrame**

Let's say we have a file `employees.csv`:
```csv
Name,Age,Department,Salary
Alice,25,Engineering,50000
Bob,30,Marketing,45000
Charlie,35,Sales,40000
```

```python
import pandas as pd

# Step 1: Read the CSV file
df = pd.read_csv('employees.csv')

# Step 2: Display the first few rows to see the data
print(df.head())
```

**Explanation:**
1.  `import pandas as pd`: Imports the pandas library.
2.  `pd.read_csv('employees.csv')`: This single function does all the hard work. It reads the CSV file, figures out the columns, and creates a DataFrame.
3.  `df = ...`: We store this DataFrame in a variable named `df` (a common convention for DataFrames).
4.  `df.head()`: Returns the first 5 rows of the DataFrame (our file has only 3, so all of them appear). Perfect for a quick look.

**Output:**
```
      Name  Age   Department  Salary
0    Alice   25  Engineering   50000
1      Bob   30    Marketing   45000
2  Charlie   35        Sales   40000
```
*Notice how pandas automatically adds an index (0, 1, 2) on the left!*
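
Besides `head()`, a few attributes give a quick overview of what was loaded. The sketch below holds the same rows as `employees.csv` in a string and reads them through `io.StringIO`, which is an assumption made so the code runs without the file on disk.

```python
import io
import pandas as pd

# Same rows as employees.csv, kept in memory for this sketch
csv_text = """Name,Age,Department,Salary
Alice,25,Engineering,50000
Bob,30,Marketing,45000
Charlie,35,Sales,40000
"""
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)          # (3, 4): 3 rows and 4 columns
print(list(df.columns))  # ['Name', 'Age', 'Department', 'Salary']
print(list(df.index))    # [0, 1, 2]: the automatic index
print(df.dtypes)         # the data type pandas inferred for each column
```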


**Example 2: Basic Data Manipulation**

Once the data is in a DataFrame, we can do many things easily.

```python
import pandas as pd
df = pd.read_csv('employees.csv')

# 1. Get a single column (e.g., just the names)
names = df['Name']
print("Names Column:\n", names)

# 2. Get basic statistics for numerical columns
print("\nBasic Statistics:\n", df.describe())

# 3. Filter data (e.g., find employees in Engineering)
engineering_team = df[df['Department'] == 'Engineering']
print("\nEngineering Team:\n", engineering_team)

# 4. Add a new column (e.g., a bonus of 5% of salary)
df['Bonus'] = df['Salary'] * 0.05
print("\nDataFrame with Bonus:\n", df)
```

**Explanation:**
1.  `df['Name']`: Uses square brackets to select a single column.
2.  `df.describe()`: Summarizes each numerical column (Age, Salary) with its count, mean, standard deviation, minimum, quartiles, and maximum.
3.  `df[df['Department'] == 'Engineering']`: This is a **filter**. It looks inside the `Department` column and only keeps the rows where the value is exactly `'Engineering'`.
4.  `df['Bonus'] = ...`: Creates a brand new column called `'Bonus'` and calculates its value for each row.
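
The filter in step 3 is easier to follow when split into two steps. This is a minimal sketch using the same employee rows held in a string (an assumption made so it runs without `employees.csv`); the variable name `mask` is only an illustrative choice.

```python
import io
import pandas as pd

csv_text = """Name,Age,Department,Salary
Alice,25,Engineering,50000
Bob,30,Marketing,45000
Charlie,35,Sales,40000
"""
df = pd.read_csv(io.StringIO(csv_text))

# Step 1: the comparison produces one True/False value per row.
mask = df["Department"] == "Engineering"
print(mask.tolist())  # [True, False, False]

# Step 2: indexing with the mask keeps only the rows marked True.
engineering_team = df[mask]
print(engineering_team["Name"].tolist())  # ['Alice']
```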



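The remaining examples use a second file, `students.csv`:

```csv
Name,Branch,Marks
Anita,CSE,85
Ravi,EEE,78
Sneha,ECE,90
```

**Example 3: Accessing Columns and Rows**

```python
import pandas as pd

# Load the students file (replaces the employees DataFrame used earlier)
df = pd.read_csv("students.csv")

# Access a single column
print(df["Name"])

# Access multiple columns (note the double brackets)
print(df[["Name", "Marks"]])

# Access a row by position (positions start at 0)
print(df.iloc[1])   # 2nd row
```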
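
A row selected with `iloc` can itself be indexed by column name. The sketch below shows this with the student rows held in a string and read through `io.StringIO`, an assumption made so the code runs without `students.csv`.

```python
import io
import pandas as pd

csv_text = """Name,Branch,Marks
Anita,CSE,85
Ravi,EEE,78
Sneha,ECE,90
"""
df = pd.read_csv(io.StringIO(csv_text))

row = df.iloc[1]     # second row (position 1)
print(row["Name"])   # Ravi
print(row["Marks"])  # 78

first_two = df.iloc[0:2]           # positions 0 and 1; the end position is excluded
print(first_two["Name"].tolist())  # ['Anita', 'Ravi']
```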
**Example 4: Filtering Data**

```python
# Students who scored above 80
high_scorers = df[df["Marks"] > 80]
print(high_scorers)
```

**Output:**

```
    Name Branch  Marks
0  Anita    CSE     85
2  Sneha    ECE     90
```

**Example 5: Adding and Modifying Columns**

```python
# Add Grade based on marks
df["Grade"] = ["A" if m >= 80 else "B" for m in df["Marks"]]
print(df)
```

**Output:**

```
    Name Branch  Marks Grade
0  Anita    CSE     85     A
1   Ravi    EEE     78     B
2  Sneha    ECE     90     A
```
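
Modifying an existing column uses the same assignment syntax as adding a new one. In this sketch the 5 grace marks are a made-up scenario, and the student rows are held in a string so the code runs without `students.csv`.

```python
import io
import pandas as pd

csv_text = """Name,Branch,Marks
Anita,CSE,85
Ravi,EEE,78
Sneha,ECE,90
"""
df = pd.read_csv(io.StringIO(csv_text))

# Modify an existing column: add 5 grace marks to everyone (hypothetical rule)
df["Marks"] = df["Marks"] + 5
print(df["Marks"].tolist())  # [90, 83, 95]

# Columns derived from Marks are not updated automatically, so compute Grade afterwards
df["Grade"] = ["A" if m >= 80 else "B" for m in df["Marks"]]
print(df["Grade"].tolist())  # ['A', 'A', 'A']
```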

**Example 6: Sorting and Aggregation**

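```python
# Sort by marks, highest first
sorted_df = df.sort_values(by="Marks", ascending=False)
print(sorted_df)

# Average marks, rounded to 2 decimal places for display
avg = df["Marks"].mean()
print(f"Average Marks: {avg:.2f}")
```

**Output:**

```
    Name Branch  Marks Grade
2  Sneha    ECE     90     A
0  Anita    CSE     85     A
1   Ravi    EEE     78     B
Average Marks: 84.33
```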

**Example 7: Writing Data Back to CSV**

```python
df.to_csv("updated_students.csv", index=False)
print("File saved successfully!")
```