# Day 4 - File Input and Output in Python

## Introduction:
Welcome to Day 4 of our 100-Day Data Science journey! Today, we'll dive into an essential programming skill: File Input and Output (I/O). Whether you're storing data, reading configurations, or processing large datasets, you need to know how to read from and write to files.
In this post, we'll start with handling text files in Python. Then, we'll work through a practical example of processing a CSV file containing public data about U.S. states, first with the built-in `csv` module and then with Pandas. Along the way, we'll briefly touch on different types of files (text, binary, CSV) to give you a broader perspective.

## Why is File I/O Important?
File I/O is an integral part of programming because it allows you to persist data between program runs, share data with other programs, and handle large datasets that might not fit into memory all at once. Being proficient in File I/O enables you to build more complex, data-driven applications.
Real-world scenarios where file I/O is critical include:
- Logging Data: Storing logs from applications for troubleshooting or analysis.
- Configuration Management: Reading and writing configuration files that control how software behaves.
- Data Analysis Pipelines: Handling large datasets efficiently by reading from and writing to files.
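
To make the "large datasets" point concrete, here is a minimal sketch of reading a file one line at a time, so the whole file never has to sit in memory. The log file, its contents, and the `count_error_lines` helper are hypothetical and exist only for illustration.

```python
import os
import tempfile


def count_error_lines(path):
    """Count lines containing 'ERROR' without loading the whole file."""
    count = 0
    with open(path, 'r', encoding='utf-8') as file:
        for line in file:  # the file object yields one line at a time
            if 'ERROR' in line:
                count += 1
    return count


# Build a small, made-up log file in a temporary directory
log_path = os.path.join(tempfile.mkdtemp(), 'app.log')
with open(log_path, 'w', encoding='utf-8') as file:
    file.write('INFO start\nERROR disk full\nINFO retry\nERROR timeout\n')

error_count = count_error_lines(log_path)
print(error_count)
```

The same loop works unchanged on a file of several gigabytes, because only the current line is held in memory.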

## Basic Theory: Understanding File I/O in Python
Python provides built-in support for working with files, allowing you to easily open, read, write, and close them. The built-in `open()` function returns a file object, and the most commonly used methods on that object are `read()`, `write()`, and `close()`.

### Opening a File
The `open()` function is used to open a file in various modes:
- 'r' - Read mode (default)
- 'w' - Write mode (overwrites the file)
- 'a' - Append mode (adds to the end of the file)
- 'b' - Binary mode (used for non-text files; combined with another mode, as in 'rb' or 'wb')

Understanding these file modes is crucial as they determine how a file is accessed:
- Read Mode ('r'): Opens the file for reading. If the file doesn't exist, a `FileNotFoundError` is raised.
- Write Mode ('w'): Opens the file for writing. If the file exists, it will be overwritten; if not, a new file will be created.
- Append Mode ('a'): Opens the file for writing but appends data to the end instead of overwriting it.
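
The modes described above can be tried out with a short sketch like the one below. The file name `modes_demo.txt` is an arbitrary choice for this illustration, and the file is created in a temporary directory so nothing in your working directory is touched.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'modes_demo.txt')

# 'r' on a file that does not exist raises FileNotFoundError
try:
    open(path, 'r')
    missing_raised = False
except FileNotFoundError:
    missing_raised = True

# 'w' creates the file because it does not exist yet
with open(path, 'w') as file:
    file.write('first\n')

# 'w' again discards the previous content
with open(path, 'w') as file:
    file.write('second\n')

# 'a' keeps the existing content and adds to the end
with open(path, 'a') as file:
    file.write('third\n')

# 'r' reads the result back as text (str)
with open(path, 'r') as file:
    text = file.read()

# 'rb' reads the same file as raw bytes
with open(path, 'rb') as file:
    raw = file.read()

print(missing_raised)
print(repr(text))
print(type(raw))
```

Note that `'first\n'` is gone after the second write: opening in 'w' mode truncates the file before anything is written.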

### Closing a File
It's important to close files after you use them to free up system resources and make sure buffered data is actually written to disk. You can call the `close()` method yourself, or use a `with` statement, which closes the file automatically when the block is exited, even if an exception occurs inside it.
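
As a rough comparison of the two approaches, the sketch below closes one file manually with `try`/`finally` and another with a `with` statement, then checks the file object's `closed` attribute. The file name `close_demo.txt` is made up for this example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'close_demo.txt')

# Manual approach: close() goes in a finally block so it runs even on errors
file = open(path, 'w')
try:
    file.write('manual close\n')
finally:
    file.close()
closed_manually = file.closed

# with statement: the file is closed as soon as the block ends
with open(path, 'r') as file:
    content = file.read()
closed_by_with = file.closed

print(closed_manually, closed_by_with)
print(repr(content))
```

Both approaches leave the file closed; the `with` form is shorter and harder to get wrong, which is why the rest of this post uses it.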

## Tutorial: Reading and Writing Text Files
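In this tutorial, we'll write a simple Python script that reads from and writes to a text file. This basic operation is the foundation for working with files in Python. The read example assumes that a file named `example.txt` already exists in the working directory; otherwise `open()` raises a `FileNotFoundError`.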

```python
# Reading from a text file
with open('example.txt', 'r') as file:
    content = file.read()
    print('File Content:')
    print(content)
```

```text
File Content:
Hello World!
```


```python
# Writing to a text file
with open('example.txt', 'w') as file:
    file.write('Hello, World!\n')
    file.write('This is a new line of text.')
```
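
To check that what was written can be read back, here is a small round-trip sketch. It uses a temporary directory and an explicit `encoding='utf-8'` argument; both are choices made for this illustration, not something the examples above require.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'example.txt')
lines_to_write = ['Hello, World!', 'This is a new line of text.']

# write() does not add newlines, so each one is appended explicitly
with open(path, 'w', encoding='utf-8') as file:
    for line in lines_to_write:
        file.write(line + '\n')

# Iterating over the file yields lines that still end in '\n'; strip it off
with open(path, 'r', encoding='utf-8') as file:
    lines_read = [line.rstrip('\n') for line in file]

print(lines_read)
```

Passing `encoding` explicitly avoids surprises when the same script runs on systems with different default encodings.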

## Real-Life Example: Processing a CSV File of Public Data
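CSV (Comma-Separated Values) files are widely used for storing tabular data, and Python's `csv` module makes it easy to work with them. In this real-life example, we'll process a CSV file containing information about U.S. states. The code assumes a file named `us_states.csv` with the columns State, Abbreviation, and Population exists in the working directory.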

```python
import csv

# Reading a CSV file (newline='' lets the csv module handle line endings itself)
with open('us_states.csv', 'r', newline='') as csvfile:
    csvreader = csv.reader(csvfile)
    header = next(csvreader)  # Read the header row
    print('Header:', header)
    for row in csvreader:  # Each remaining row is a list of strings
        print(row)

# Writing to a CSV file
with open('processed_states.csv', 'w', newline='') as csvfile:
    csvwriter = csv.writer(csvfile)
    csvwriter.writerow(['State', 'Abbreviation', 'Population'])
    csvwriter.writerow(['California', 'CA', '39538223'])
    csvwriter.writerow(['Texas', 'TX', '29145505'])
```

```text
Header: ['State', 'Abbreviation', 'Population']
['Florida', 'FL', '21538187']
['New York', 'NY', '20201249']
['Pennsylvania', 'PA', '13002700']
```
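
When rows are easier to handle by column name than by position, the `csv` module also offers `csv.DictReader` and `csv.DictWriter`. The sketch below is one possible way to use them: it recreates the three sample rows shown above in a temporary directory, then keeps only states above a 20 million threshold. The threshold and the output file name `large_states.csv` are arbitrary choices for this illustration.

```python
import csv
import os
import tempfile

workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, 'us_states.csv')
target_path = os.path.join(workdir, 'large_states.csv')

fieldnames = ['State', 'Abbreviation', 'Population']
rows = [
    {'State': 'Florida', 'Abbreviation': 'FL', 'Population': '21538187'},
    {'State': 'New York', 'Abbreviation': 'NY', 'Population': '20201249'},
    {'State': 'Pennsylvania', 'Abbreviation': 'PA', 'Population': '13002700'},
]

# Create the sample input file so the example is self-contained
with open(source_path, 'w', newline='', encoding='utf-8') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)

# DictReader uses the header row as keys; values are still strings
with open(source_path, 'r', newline='', encoding='utf-8') as csvfile:
    reader = csv.DictReader(csvfile)
    large_states = [row for row in reader if int(row['Population']) > 20_000_000]

# Write the filtered rows to a new file with the same columns
with open(target_path, 'w', newline='', encoding='utf-8') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(large_states)

state_names = [row['State'] for row in large_states]
print(state_names)
```

Because the `csv` module returns every value as a string, the population has to be converted with `int()` before it can be compared to a number.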


## I/O with Pandas: Simplifying Data Processing
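While the `csv` module is great for basic file operations, Pandas offers a more powerful and flexible way to handle data in CSV files (and other formats such as Excel and JSON). It loads the file into a DataFrame, a table-like structure that you can filter, transform, and write back out in a few lines.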


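If Pandas is not installed yet, install it first. The leading `!` runs the command from a notebook cell; in a terminal, drop it.

```
!pip install pandas
```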



### Reading a CSV File with Pandas

```python
import pandas as pd

# Reading a CSV file into a DataFrame
df = pd.read_csv('us_states.csv')
print('DataFrame Head:')
print(df.head())
```

```text
DataFrame Head:
          State Abbreviation  Population
0       Florida           FL    21538187
1      New York           NY    20201249
2  Pennsylvania           PA    13002700
```


### Writing a CSV File with Pandas

```python
# Writing the DataFrame to a new CSV file
df.to_csv('processed_states_pandas.csv', index=False)
```

### Processing Data with Pandas

```python
# Filter states with population over 10 million
high_pop_states = df[df['Population'] > 10000000]
print('States with Population Over 10 Million:')
print(high_pop_states)
```

```text
States with Population Over 10 Million:
          State Abbreviation  Population
0       Florida           FL    21538187
1      New York           NY    20201249
2  Pennsylvania           PA    13002700
```
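
For files too large to load in one go, `pd.read_csv` accepts a `chunksize` argument and then yields DataFrames of at most that many rows. The sketch below applies the same kind of population filter chunk by chunk. It assumes Pandas is installed, recreates the three sample rows in a temporary file, and uses a tiny chunk size of 2 and a 20 million threshold purely for demonstration.

```python
import os
import tempfile

import pandas as pd

# Recreate the sample data so the example is self-contained
csv_path = os.path.join(tempfile.mkdtemp(), 'us_states.csv')
pd.DataFrame({
    'State': ['Florida', 'New York', 'Pennsylvania'],
    'Abbreviation': ['FL', 'NY', 'PA'],
    'Population': [21538187, 20201249, 13002700],
}).to_csv(csv_path, index=False)

filtered_chunks = []
# With chunksize set, read_csv returns an iterator of DataFrames
for chunk in pd.read_csv(csv_path, chunksize=2):
    filtered = chunk[chunk['Population'] > 20_000_000]
    if not filtered.empty:  # skip chunks where no row passed the filter
        filtered_chunks.append(filtered)

# Combine the surviving rows into a single DataFrame
large_states = pd.concat(filtered_chunks, ignore_index=True)
print(large_states)
```

In practice the chunk size would be in the thousands or more; only the filtered rows are kept, so memory use stays bounded by the chunk size plus the results.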


## Conclusion:
In today's post, we explored the essentials of File I/O in Python. We started with basic operations like reading and writing text files, then applied these concepts to a real-life example of processing a CSV file containing public data, first with the `csv` module and then with Pandas.
Understanding how to work with files is fundamental for any data-driven application. Make sure to practice these concepts, as they will be invaluable in your journey towards mastering Python and Data Science.