# Python Data Serialization and I/O
This notebook introduces the main ways to save, load, and exchange data in Python. Each section is concise, practical, and beginner-friendly.

## What is Pickle?
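Pickle is a Python standard-library module for saving (serializing) and loading (deserializing) Python objects. It is useful for saving models, settings, or almost any Python data structure to disk. Pickle files are Python-specific, and loading one can execute arbitrary code, so only unpickle data from sources you trust.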

In [None]:
import pickle

## Create a Python dictionary
This cell creates a simple dictionary to demonstrate serialization.

In [None]:
my_data = {'a': 1, 'b': 2}

## Save a Python object to a file using Pickle
This cell shows how to serialize a Python object and write it to a file.

In [None]:
with open('data.pkl', 'wb') as f:
    pickle.dump(my_data, f)

## Load a Python object from a Pickle file
This cell demonstrates how to load a previously saved object from disk.

In [None]:
with open('data.pkl', 'rb') as f:
    loaded = pickle.load(f)
print(loaded)

**Use case:** Save trained ML models, user settings, or session data for later use.
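## Round-trip an object in memory with Pickle
The file-based cells above have an in-memory counterpart: `pickle.dumps` returns bytes and `pickle.loads` rebuilds the object. The sketch below is illustrative only; the `settings` dictionary is a made-up example, and choosing `pickle.HIGHEST_PROTOCOL` is one reasonable option rather than a requirement.

```python
import pickle

# Hypothetical settings object used only for this demonstration
settings = {'theme': 'dark', 'recent': [1, 2, 3], 'size': (800, 600)}

# Serialize to bytes using the newest protocol this Python version supports
blob = pickle.dumps(settings, protocol=pickle.HIGHEST_PROTOCOL)

# Deserialize: the result is an equal but separate copy
restored = pickle.loads(blob)

print(type(blob).__name__, restored == settings, restored is settings)
```

Unlike JSON, Pickle keeps Python-specific types such as tuples intact, which is why `restored['size']` is still a tuple.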

## What is CSV?
CSV (Comma-Separated Values) is a widely supported plain-text format for tabular data, such as spreadsheet or database exports.

In [None]:
import csv

## Prepare data for CSV
This cell prepares a list of rows for writing to a CSV file.

In [None]:
rows = [['name', 'age'], ['Alice', 30], ['Bob', 25]]

## Write data to a CSV file
This cell shows how to write tabular data to a CSV file.

In [None]:
with open('people.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(rows)

## Read data from a CSV file
This cell demonstrates how to read tabular data from a CSV file. Every value is read back as a string, so the age 30 comes back as '30'.

In [None]:
with open('people.csv', 'r', newline='') as f:  # newline='' is recommended by the csv docs
    reader = csv.reader(f)
    for row in reader:
        print(row)

**Use case:** Import/export data between Excel, databases, and Python.
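## Read and write CSV rows as dictionaries
When a CSV file has a header row, `csv.DictWriter` and `csv.DictReader` let you work with column names instead of positions. This is a minimal sketch that uses an in-memory `io.StringIO` buffer in place of a real file; the `people` list and the variable names are assumptions made for the example.

```python
import csv
import io

# Illustrative records; in practice these might come from a database or API
people = [{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]

# An in-memory text buffer stands in for a file opened with newline=''
buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=['name', 'age'])
writer.writeheader()        # writes the "name,age" header row
writer.writerows(people)

# Rewind and read the rows back as dictionaries keyed by the header
buffer.seek(0)
records = list(csv.DictReader(buffer))

# CSV has no types: every value is a string until you convert it
ages = [int(record['age']) for record in records]
print(records)
print(ages)
```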

## What is JSON?
JSON (JavaScript Object Notation) is a widely used, human-readable format for data exchange between languages and systems.

In [None]:
import json

## Serialize a Python object to JSON and save to file
This cell shows how to save a Python dictionary as JSON.

In [None]:
person = {'name': 'Alice', 'age': 30, 'city': 'New York'}
with open('person.json', 'w') as f:
    json.dump(person, f)

## Load data from a JSON file
This cell shows how to read and parse JSON data from a file.

In [None]:
with open('person.json', 'r') as f:
    loaded_person = json.load(f)
print(loaded_person)

**Use case:** Web APIs, config files, and cross-language data exchange.
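## Convert between Python objects and JSON strings
`json.dumps` and `json.loads` are the string-based counterparts of `json.dump` and `json.load`. The sketch below is a small illustration with made-up data; `indent` and `sort_keys` are optional arguments that only affect how the output is laid out.

```python
import json

# Illustrative data; the tuple is included to show a type JSON cannot keep
person = {'name': 'Alice', 'age': 30, 'languages': ('Python', 'SQL')}

# Serialize to a readable, indented string with keys in sorted order
text = json.dumps(person, indent=2, sort_keys=True)
print(text)

# Parse the string back into Python objects
restored = json.loads(text)

# JSON only has arrays, so the tuple comes back as a list
print(restored['languages'])
```

JSON supports only strings, numbers, booleans, null, arrays, and objects, so a round trip can change Python types in this way.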

## What is XML?
XML (Extensible Markup Language) is a markup format for structured, hierarchical data, common in web services and config files.

In [None]:
import xml.etree.ElementTree as ET

## Example XML data as a string
This cell defines a simple XML string for demonstration.

In [None]:
xml_data = '<root><person><name>Alice</name></person></root>'

## Parse XML data from a string
This cell shows how to parse XML and access its elements.

In [None]:
root = ET.fromstring(xml_data)
for person in root.findall('person'):
    print(person.find('name').text)

**Use case:** Data exchange with legacy systems, config files, or APIs.
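## Build XML from Python objects
Parsing has a mirror image: `ET.Element` and `ET.SubElement` build a tree that `ET.tostring` turns into text. This is a minimal sketch; the element names follow the sample string used earlier, and the list of names is invented for the example.

```python
import xml.etree.ElementTree as ET

# Build <root><person><name>...</name></person>...</root> step by step
root = ET.Element('root')
for name in ['Alice', 'Bob']:
    person = ET.SubElement(root, 'person')
    ET.SubElement(person, 'name').text = name

# encoding='unicode' returns a str instead of bytes
xml_text = ET.tostring(root, encoding='unicode')
print(xml_text)

# Parse the generated text again to confirm it round-trips
names = [p.findtext('name') for p in ET.fromstring(xml_text).findall('person')]
print(names)
```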

## What is Excel?
Excel files are common in business for storing and analyzing tabular data. pandas makes reading and writing them easy; for .xlsx files it relies on a separate engine package such as openpyxl.

In [None]:
# Requires: pip install pandas openpyxl
import pandas as pd

## Create a DataFrame
This cell creates a pandas DataFrame for demonstration.

In [None]:
data = {'name': ['Alice', 'Bob'], 'age': [30, 25]}
df = pd.DataFrame(data)

## Write a DataFrame to an Excel file
This cell shows how to save a DataFrame to an Excel file.

In [None]:
df.to_excel('people.xlsx', index=False)

## Read a DataFrame from an Excel file
This cell demonstrates how to load data from an Excel file.

In [None]:
read_df = pd.read_excel('people.xlsx')
print(read_df)

**Use case:** Automate report generation, data analysis, or business workflows.

## What are Plain Text Files?
Plain text files are the simplest way to store and share data, logs, or notes.

## Write to a text file
This cell shows how to write text to a file.

In [None]:
with open('notes.txt', 'w') as f:
    f.write('This is a line of text.\nAnother line.')

## Read from a text file
This cell demonstrates how to read all lines from a text file into a list. readlines() keeps the newline character at the end of each line that has one.

In [None]:
with open('notes.txt', 'r') as f:
    lines = f.readlines()
print(lines)

**Use case:** Logging, configuration, and simple data storage.
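## Append to a text file and read it line by line
Two common variations on the cells above are appending with mode `'a'` and iterating over the file object instead of calling `readlines()`. The sketch below assumes UTF-8 text and writes to a temporary directory so it leaves nothing behind; the file name `notes_demo.txt` is made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'notes_demo.txt')

    # 'w' creates (or overwrites) the file
    with open(path, 'w', encoding='utf-8') as f:
        f.write('first line\n')

    # 'a' appends to the end instead of overwriting
    with open(path, 'a', encoding='utf-8') as f:
        f.write('second line\n')

    # Iterating over the file reads one line at a time
    with open(path, 'r', encoding='utf-8') as f:
        stripped_lines = [line.rstrip('\n') for line in f]

print(stripped_lines)
```

Passing `encoding` explicitly avoids depending on the platform's default text encoding.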

## What are ZIP Files?
ZIP files are used to compress and bundle multiple files for storage or sharing.

In [None]:
import zipfile

## Create a ZIP file and add files to it
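This cell shows how to create a ZIP archive and add files. By default, ZipFile stores files without compressing them; pass compression=zipfile.ZIP_DEFLATED to compress.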

In [None]:
with zipfile.ZipFile('archive.zip', 'w') as zipf:
    zipf.write('notes.txt')
    zipf.write('person.json')

## Extract files from a ZIP archive
This cell shows how to extract all files from a ZIP archive.

In [None]:
with zipfile.ZipFile('archive.zip', 'r') as zipf:
    zipf.extractall('extracted_files')

**Use case:** Data backup, sharing datasets, and packaging projects.
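## Compress and inspect a ZIP archive
`zipfile.ZipFile` only compresses when asked to, and it can list an archive's contents without extracting anything. This is a minimal sketch that builds the archive in an `io.BytesIO` buffer rather than on disk; the member name and repeated text are invented for the example.

```python
import io
import zipfile

# Repetitive text compresses well, which makes the effect easy to see
content = 'This is a line of text.\n' * 100

buffer = io.BytesIO()
# ZIP_DEFLATED turns on compression; the default (ZIP_STORED) does not compress
with zipfile.ZipFile(buffer, 'w', compression=zipfile.ZIP_DEFLATED) as zipf:
    zipf.writestr('notes.txt', content)   # add a member directly from a string

buffer.seek(0)
with zipfile.ZipFile(buffer, 'r') as zipf:
    names = zipf.namelist()               # member names, nothing is extracted
    info = zipf.getinfo('notes.txt')      # size metadata for one member
    text = zipf.read('notes.txt').decode('utf-8')

print(names, info.file_size, info.compress_size)
```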

## What is YAML?
YAML (YAML Ain't Markup Language) is a readable format for configuration and data exchange, popular in DevOps and data science.

In [None]:
# Requires: pip install pyyaml
import yaml

## Serialize Python object to YAML
This cell shows how to save a Python dictionary as YAML.

In [None]:
config = {'version': 1, 'settings': {'theme': 'dark', 'autosave': True}}
with open('config.yaml', 'w') as f:
    yaml.dump(config, f)

## Load data from a YAML file
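This cell shows how to read YAML data from a file. yaml.safe_load only builds plain Python types such as dicts, lists, and strings, which makes it the right choice for files you did not write yourself.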

In [None]:
with open('config.yaml', 'r') as f:
    loaded_config = yaml.safe_load(f)
print(loaded_config)

**Use case:** Application configuration, cloud infrastructure, and data pipelines.

## What is Pillow?
Pillow is the maintained fork of PIL (the Python Imaging Library) for opening, processing, and saving images. It is still imported under the name PIL.

In [None]:
# Requires: pip install pillow
from PIL import Image

## Create a new image
This cell creates a new image in memory.

In [None]:
img = Image.new('RGB', (100, 50), color='red')

## Save an image to a file
This cell shows how to save an image created in Python.

In [None]:
img.save('red.png')

## Open and display an image
This cell demonstrates how to open and display an image file. show() opens the image in your system's default image viewer.

In [None]:
img = Image.open('red.png')
img.show()

**Use case:** Automate image processing, create thumbnails, or analyze photos.

## What is PyPDF2?
PyPDF2 is a Python library for reading and manipulating PDF files. Its development has since moved to the pypdf package, which offers a similar API.

In [None]:
# Requires: pip install PyPDF2
import PyPDF2

## Extract text from a PDF file
This cell shows how to extract text from the first page of a PDF. It assumes a file named example.pdf exists in the working directory.

In [None]:
with open('example.pdf', 'rb') as f:
    reader = PyPDF2.PdfReader(f)
    page = reader.pages[0]
    print(page.extract_text())

**Use case:** Extract data from reports, automate paperwork, or combine PDFs.

## What are Binary Files?
Binary files store data as raw bytes rather than text, which suits images, audio, or custom formats.

## Write to a binary file
This cell shows how to write bytes to a binary file.

In [None]:
with open('binary_example.bin', 'wb') as f:
    f.write(b'\x00\x01\x02Hello')

## Read from a binary file
This cell demonstrates how to read bytes from a binary file.

In [None]:
with open('binary_example.bin', 'rb') as f:
    data = f.read()
print(data)

**Use case:** Storing images, audio, or any non-text data.
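## Pack numbers into bytes with struct
Custom binary formats usually store numbers in a fixed layout, and the standard-library `struct` module converts between Python values and those bytes. The sketch below is illustrative; the record layout (`'<IH'`, a little-endian 4-byte unsigned int followed by a 2-byte unsigned short) and the field names are assumptions made for the example.

```python
import struct

# '<' = little-endian, 'I' = 4-byte unsigned int, 'H' = 2-byte unsigned short
RECORD = struct.Struct('<IH')

# Turn two Python integers into exactly RECORD.size bytes
packed = RECORD.pack(1, 2)
print(packed, RECORD.size)

# Reverse the process: bytes back to a tuple of integers
record_id, flags = RECORD.unpack(packed)
print(record_id, flags)
```

The resulting bytes can be written with `f.write(packed)` to a file opened in `'wb'` mode, as in the cells above.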

## What is SQLite?
SQLite is a lightweight, file-based SQL database. Python's standard library includes the sqlite3 module for working with it, which makes it great for small apps and prototyping.

In [None]:
import sqlite3

## Create a SQLite database and table
This cell creates a new database and a table.

In [None]:
conn = sqlite3.connect('example.db')
cursor = conn.cursor()
cursor.execute('CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT)')
conn.commit()

## Insert data into a SQLite table
This cell inserts a row into the users table. The ? placeholder lets sqlite3 pass the value safely instead of building the SQL string by hand.

In [None]:
cursor.execute('INSERT INTO users (name) VALUES (?)', ('Alice',))
conn.commit()

## Query data from a SQLite table
This cell fetches all rows from the users table.

In [None]:
cursor.execute('SELECT * FROM users')
print(cursor.fetchall())
conn.close()

**Use case:** Local databases for apps, prototyping, or data analysis.
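## Insert several rows in one transaction
For quick experiments, an in-memory database (`':memory:'`) avoids creating a file, and using the connection as a context manager commits on success or rolls back on error. This is a minimal sketch under those assumptions; the table mirrors the `users` table above and the names are sample data.

```python
import sqlite3

# ':memory:' creates a temporary database that disappears when closed
conn = sqlite3.connect(':memory:')
try:
    # 'with conn' commits if the block succeeds and rolls back if it raises
    with conn:
        conn.execute('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)')
        conn.executemany(
            'INSERT INTO users (name) VALUES (?)',
            [('Alice',), ('Bob',)],
        )

    # Select explicit columns and order the result for predictable output
    rows = conn.execute('SELECT id, name FROM users ORDER BY id').fetchall()
finally:
    conn.close()  # the context manager above does not close the connection

print(rows)
```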

## What is shelve?
shelve is a Python standard-library module that provides a persistent dictionary-like object. Keys must be strings, and values can be any object that pickle can handle.

In [None]:
import shelve

## Store and retrieve data with shelve
This cell shows how to use shelve to persist Python objects.

In [None]:
with shelve.open('my_shelf.db') as db:
    db['key'] = {'a': 1, 'b': 2}
    print(db['key'])

**Use case:** Simple persistent storage for Python objects.
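## Update a stored value in a shelf
With the default `writeback=False`, reading a value from a shelf returns a copy, so changing that copy in place does not change what is stored. The sketch below illustrates the usual fetch, modify, reassign pattern; it writes to a temporary directory, and the shelf name `demo_shelf` and key `scores` are made up for the example.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo_shelf')

    with shelve.open(path) as db:
        db['scores'] = [1, 2]

        # This changes a temporary copy only; the shelf still holds [1, 2]
        db['scores'].append(99)

        # Fetch, modify, then assign back so the change is saved
        scores = db['scores']
        scores.append(3)
        db['scores'] = scores

    # Reopen the shelf to confirm what was actually persisted
    with shelve.open(path) as db:
        saved_scores = db['scores']

print(saved_scores)
```

Opening the shelf with `writeback=True` is an alternative that tracks in-place changes, at the cost of keeping accessed entries cached in memory.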

## What are tar archives?
Tar files (.tar, .tar.gz) bundle multiple files into one archive, optionally compressed (for example with gzip). They are common on Unix/Linux.

In [None]:
import tarfile

## Create a tar archive
This cell shows how to create a tar archive and add files.

In [None]:
with tarfile.open('archive.tar.gz', 'w:gz') as tar:
    tar.add('notes.txt')
    tar.add('person.json')

## Extract files from a tar archive
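This cell shows how to extract all files from a tar archive. Only extract archives from sources you trust, because member paths in a malicious archive can point outside the target directory.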

In [None]:
with tarfile.open('archive.tar.gz', 'r:gz') as tar:
    tar.extractall('extracted_tar')

**Use case:** Packaging, backups, and sharing datasets in Unix/Linux environments.
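## List and read tar members without extracting
A tar archive can be inspected, and single members read, without unpacking it to disk. This is a minimal sketch that builds a gzip-compressed archive in an `io.BytesIO` buffer; the member name `hello.txt` and its contents are invented for the example.

```python
import io
import tarfile

payload = b'hello from tar'

# Build a .tar.gz archive in memory instead of on disk
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode='w:gz') as tar:
    info = tarfile.TarInfo(name='hello.txt')
    info.size = len(payload)              # size must match the data added
    tar.addfile(info, io.BytesIO(payload))

# Rewind and open the same bytes for reading
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode='r:gz') as tar:
    member_names = tar.getnames()         # list contents without extracting
    extracted = tar.extractfile('hello.txt').read()

print(member_names, extracted)
```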