### **4. Data Input and Output**

Data input and output (I/O) covers reading data from files into DataFrames and writing DataFrames back out to files. Pandas provides reader and writer functions for many formats; this section covers CSV, Excel, and JSON.

#### **Reading Data**

1. **CSV Files: `read_csv()`**

    - CSV (Comma-Separated Values) files are one of the most widely used formats for tabular data. They are plain text files where each line represents a record, and each record's fields are separated by a comma (or another delimiter). The `read_csv()` function reads such files into a DataFrame. It offers many parameters for handling different file structures, such as custom delimiters, header rows, and missing values.

- Advanced Features:

  - `Delimiters:` By default, `read_csv()` uses commas as delimiters, but you can specify another delimiter with the `sep` parameter (e.g., `sep='\t'` for tab-separated files).
  - `Handling Headers:` The `header` parameter selects which row supplies the column names. For example, `header=None` treats the first row as data rather than as headers.
  - `Handling Missing Data:` The `na_values` parameter lets you specify additional strings to recognize as missing values.
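
The three parameters above can be combined in a single call. The following is a minimal sketch rather than a definitive recipe: the semicolon-delimited text and the `'unknown'` marker are made up for illustration, and the `names` parameter (not discussed above) is added only to label the columns of a file that has no header row.

```python
import io

import pandas as pd

# Hypothetical semicolon-delimited text with no header row.
# 'N/A' is recognized as missing by default; 'unknown' is not.
raw_text = (
    "Bhagath;25;Bangalore\n"
    "Bharath;N/A;Chennai\n"
    "Monika;35;unknown\n"
)

df = pd.read_csv(
    io.StringIO(raw_text),          # a file path works here as well
    sep=';',                        # custom delimiter
    header=None,                    # the first line is data, not column names
    names=['Name', 'Age', 'City'],  # supply the column names ourselves
    na_values=['unknown'],          # extra string to treat as missing
)

print(df)
print(df.isna().sum())  # count of missing values per column
```

Reading from `io.StringIO` keeps the sketch self-contained; with a real file, only the first argument changes.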

   **Example:**

   ```python
   import pandas as pd

   # Creating a DataFrame and saving it as a CSV file
   data = {
       'Name': ['Bhagath', 'Bharath', 'Monika', 'Padhmavathi'],
       'Age': [25, 30, 35, 28],
       'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Chickkaballapur']
   }
   df = pd.DataFrame(data)
   df.to_csv('people.csv', index=False)

   # Reading the CSV file back into a DataFrame
   df_read = pd.read_csv('people.csv')
   print(df_read)
   ```

2. **Excel Files: `read_excel()`**
   - Excel files are popular for storing and manipulating data in spreadsheets. A workbook can contain multiple sheets, various data types, and complex formatting. The `read_excel()` function reads Excel files into pandas DataFrames, allowing you to specify sheet names, handle multiple sheets, and parse dates. Reading `.xlsx` files relies on an optional engine such as `openpyxl`, which must be installed separately.

- Advanced Features:

  - `Sheet Handling:` You can specify which sheet to read using the `sheet_name` parameter. It can be an integer (sheet index), a string (sheet name), or a list (multiple sheets). A list returns a dictionary of DataFrames keyed by sheet.
  - `Date Parsing:` The `parse_dates` parameter converts the listed columns to datetime values, which simplifies handling date and time data.
  - `Data Type Conversion:` The `dtype` parameter lets you specify data types for columns to ensure data consistency.
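
The options above can be sketched as follows. This is an illustration under stated assumptions, not the only way to do it: it assumes the `openpyxl` engine is installed (and skips the work otherwise), uses an in-memory buffer in place of a real file, and invents a `Joined` column stored as text so that `parse_dates` has something to convert.

```python
import importlib.util
import io

import pandas as pd

# Hypothetical data; 'Joined' is stored as text on purpose.
people = pd.DataFrame({
    'Name': ['Bhagath', 'Bharath'],
    'Age': [25, 30],
    'Joined': ['2021-01-15', '2022-06-01'],
})
cities = pd.DataFrame({'City': ['Bangalore', 'Chennai']})

people_read = None
all_sheets = None

# Assumption: openpyxl is available; the sketch is skipped otherwise.
if importlib.util.find_spec('openpyxl') is not None:
    buffer = io.BytesIO()  # in-memory stand-in for a file such as 'people.xlsx'
    with pd.ExcelWriter(buffer, engine='openpyxl') as writer:
        people.to_excel(writer, sheet_name='People', index=False)
        cities.to_excel(writer, sheet_name='Cities', index=False)
    excel_bytes = buffer.getvalue()

    # One sheet selected by name, with an explicit dtype and date parsing
    people_read = pd.read_excel(
        io.BytesIO(excel_bytes),
        sheet_name='People',
        dtype={'Age': 'float64'},
        parse_dates=['Joined'],
        engine='openpyxl',
    )
    print(people_read.dtypes)

    # A list of sheet names returns a dictionary of DataFrames
    all_sheets = pd.read_excel(
        io.BytesIO(excel_bytes),
        sheet_name=['People', 'Cities'],
        engine='openpyxl',
    )
    print(list(all_sheets))
```

With a file on disk, the `io.BytesIO(...)` arguments would simply be replaced by the file path.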

**Example:**

```python
import pandas as pd

# Creating and saving an Excel file
data = {
    'Name': ['Bhagath', 'Bharath', 'Monika', 'Padhmavathi'],
    'Age': [25, 30, 35, 28],
    'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Chickkaballapur']
}
df = pd.DataFrame(data)
df.to_excel('people.xlsx', index=False, sheet_name='Sheet1')

# Reading the Excel file back into a DataFrame
df_read = pd.read_excel('people.xlsx', sheet_name='Sheet1')
print(df_read)
```


3. **JSON Files: `read_json()`**
   - JSON (JavaScript Object Notation) is a lightweight format for data interchange, commonly used in web applications and APIs. JSON data is hierarchical and can include nested structures. The `read_json()` function reads JSON files into pandas DataFrames, supporting various JSON formats and structures.

- Advanced Features:

  - `Orientations:` The `orient` parameter tells pandas how the JSON is laid out. Common options include `records` (list of dictionaries), `split` (dictionary with separate keys for index, columns, and data), and `index` (dictionary keyed by index label).
  - `Handling Nested JSON:` `read_json()` does not flatten nested objects on its own. The `pd.json_normalize()` function expands nested dictionaries into separate columns, although deeply nested or irregular data may need additional processing.
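
To make the layouts concrete, here is a small sketch that reads the same two rows in each orientation and then flattens a nested structure. The JSON strings and the `Address` field are invented for illustration, and `io.StringIO` stands in for a file path.

```python
import io

import pandas as pd

# The same two rows written in three hypothetical layouts
records_json = '[{"Name": "Monika", "Age": 35}, {"Name": "Bharath", "Age": 30}]'
split_json = (
    '{"columns": ["Name", "Age"], "index": [0, 1], '
    '"data": [["Monika", 35], ["Bharath", 30]]}'
)
index_json = (
    '{"0": {"Name": "Monika", "Age": 35}, '
    '"1": {"Name": "Bharath", "Age": 30}}'
)

df_records = pd.read_json(io.StringIO(records_json), orient='records')
df_split = pd.read_json(io.StringIO(split_json), orient='split')
df_index = pd.read_json(io.StringIO(index_json), orient='index')

# Nested objects: json_normalize expands inner keys into dotted column names
nested = [
    {'Name': 'Monika', 'Address': {'City': 'Hyderabad', 'State': 'Telangana'}},
    {'Name': 'Bharath', 'Address': {'City': 'Chennai', 'State': 'Tamil Nadu'}},
]
df_flat = pd.json_normalize(nested)
print(df_flat.columns.tolist())  # ['Name', 'Address.City', 'Address.State']
```

All three `read_json()` calls produce the same two-row table; only the on-disk layout differs.
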

**Example:**

```python
import pandas as pd

# Creating and saving a JSON file
data = {
    'Name': ['Bhagath', 'Bharath', 'Monika', 'Padhmavathi'],
    'Age': [25, 30, 35, 28],
    'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Chickkaballapur']
}
df = pd.DataFrame(data)
# lines=True writes one JSON object per line (JSON Lines format)
df.to_json('people.json', orient='records', lines=True)

# Reading the JSON file back into a DataFrame
df_read = pd.read_json('people.json', orient='records', lines=True)
print(df_read)
```


#### **Writing Data**

1. **To CSV: `to_csv()`**

   - Writing data to CSV files is essential for exporting and sharing data. The `to_csv()` function exports a DataFrame to a CSV file, with options that control the delimiter, the index, and compression.

- Advanced Features:

  - `Custom Delimiters:` You can specify a custom delimiter using the `sep` parameter (e.g., `sep='\t'` for tab-separated output).
  - `Index Handling:` The `index` parameter controls whether the DataFrame index is included in the output file. Setting it to `False` excludes the index.
  - `Compression:` The `compression` parameter supports compression formats such as gzip, bz2, and zip to save disk space. By default, the format is inferred from the file extension.
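
As a rough sketch of these options working together, the snippet below writes a tab-separated, gzip-compressed file into a temporary directory and reads it back. The file name `people.tsv.gz` is hypothetical, and the temporary directory is used only so that the example leaves nothing behind.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({
    'Name': ['Bhagath', 'Bharath'],
    'Age': [25, 30],
    'City': ['Bangalore', 'Chennai'],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'people.tsv.gz')  # hypothetical file name

    # Tab-separated, no index column, gzip-compressed
    df.to_csv(path, sep='\t', index=False, compression='gzip')

    # read_csv infers gzip from the '.gz' extension
    df_back = pd.read_csv(path, sep='\t')

# With no path, to_csv returns the text, which makes the index effect visible
with_index = df.to_csv()
without_index = df.to_csv(index=False)
print(with_index.splitlines()[0])     # ,Name,Age,City
print(without_index.splitlines()[0])  # Name,Age,City
```

The leading comma in the first header line is the unnamed index column, which is why `index=False` is usually preferred when the index carries no information.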

**Example:**

```python
import pandas as pd

# DataFrame to be written to CSV
data = {
    'Name': ['Bhagath', 'Bharath', 'Monika', 'Padhmavathi'],
    'Age': [25, 30, 35, 28],
    'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Chickkaballapur']
}
df = pd.DataFrame(data)

# Writing the DataFrame to a CSV file
df.to_csv('people_output.csv', index=False)
```


2. **To Excel: `to_excel()`**

   - Exporting data to Excel is useful for generating reports and for sharing data with people who work in spreadsheets. The `to_excel()` function saves a DataFrame to a sheet of an Excel file; combined with `ExcelWriter`, it can write several sheets to one workbook and apply formatting.

- Advanced Features:

  - `Multiple Sheets:` You can write multiple DataFrames to different sheets within the same Excel file using the `ExcelWriter` context manager.
  - `Formatting:` Passing `engine='xlsxwriter'` to `ExcelWriter` gives access to the XlsxWriter workbook and worksheet objects, which can apply cell formats and column widths.
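
The multiple-sheet case can be sketched as below; formatting is left out to keep the example short. The sketch assumes the `openpyxl` engine is installed (it is skipped otherwise), and it writes to an in-memory buffer instead of a real path such as `report.xlsx`, which is a made-up name.

```python
import importlib.util
import io

import pandas as pd

people = pd.DataFrame({'Name': ['Bhagath', 'Bharath'], 'Age': [25, 30]})
cities = pd.DataFrame({'City': ['Bangalore', 'Chennai']})

sheet_names = None

# Assumption: openpyxl is available; the sketch is skipped otherwise.
if importlib.util.find_spec('openpyxl') is not None:
    buffer = io.BytesIO()  # stand-in for a path such as 'report.xlsx'

    # Every to_excel call inside the block adds a sheet to the same workbook
    with pd.ExcelWriter(buffer, engine='openpyxl') as writer:
        people.to_excel(writer, sheet_name='People', index=False)
        cities.to_excel(writer, sheet_name='Cities', index=False)

    # Reopen the workbook to confirm that both sheets were written
    with pd.ExcelFile(io.BytesIO(buffer.getvalue()), engine='openpyxl') as workbook:
        sheet_names = workbook.sheet_names
    print(sheet_names)
```

The workbook is saved when the `with` block exits, so no explicit save call is needed.
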

**Example:**

```python
import pandas as pd

# DataFrame to be written to Excel
data = {
    'Name': ['Bhagath', 'Bharath', 'Monika', 'Padhmavathi'],
    'Age': [25, 30, 35, 28],
    'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Chickkaballapur']
}
df = pd.DataFrame(data)

# Writing the DataFrame to an Excel file
df.to_excel('people_output.xlsx', index=False, sheet_name='Sheet1')
```


3. **To JSON: `to_json()`**

   - Writing data to JSON format is useful for interoperability with web applications and APIs. The `to_json()` function exports a DataFrame to a JSON file, with options that control the structure and formatting of the output.

- Advanced Features:

  - `Orientations:` The `orient` parameter specifies the structure of the JSON output. Options include `records`, `split`, and `index`, each of which lays out the same data differently.
  - `Indentation:` The `indent` parameter sets the number of spaces used to indent each nesting level, which makes the file easier to read.
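
The difference between the orientations is easiest to see by printing them. In this sketch, `to_json()` is called without a path so that it returns the JSON text rather than writing a file; the two-row DataFrame is made up for illustration.

```python
import json

import pandas as pd

df = pd.DataFrame({'Name': ['Monika', 'Bharath'], 'Age': [35, 30]})

# With no path, to_json returns the JSON text instead of writing a file
as_records = df.to_json(orient='records')
as_split = df.to_json(orient='split')
as_index = df.to_json(orient='index')

print(as_records)  # [{"Name":"Monika","Age":35},{"Name":"Bharath","Age":30}]
print(as_split)    # separate "columns", "index", and "data" keys
print(as_index)    # one entry per index label

# indent=2 spreads the same records over several indented lines
pretty = df.to_json(orient='records', indent=2)
print(pretty)

# The indented text parses to exactly the same data
assert json.loads(pretty) == json.loads(as_records)
```

Indentation changes only the whitespace, so it is safe to use whenever people need to read the file and the size increase is acceptable.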

**Example:**

```python
import pandas as pd

# DataFrame to be written to JSON
data = {
    'Name': ['Bhagath', 'Bharath', 'Monika', 'Padhmavathi'],
    'Age': [25, 30, 35, 28],
    'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Chickkaballapur']
}
df = pd.DataFrame(data)

# Writing the DataFrame to a JSON file, one object per line (JSON Lines)
df.to_json('people_output.json', orient='records', lines=True)
```
