# 4. Importing and Exporting Data

In this notebook, we will explore how to import data from various sources (CSV, Excel, JSON, SQL, APIs) into Pandas DataFrames and export data to different formats. We will also discuss common errors and how to handle them.

## Topics Covered:
- Reading data from CSV, Excel, JSON, SQL, and APIs
- Exporting data to CSV, Excel, and other formats
- Handling file paths and common errors
- Practical: Reading and saving datasets

## Reading Data from CSV

Comma-Separated Values (CSV) files are one of the most common formats for tabular data.
We will use the COVID-19 dataset from Indonesia as an example.

### Steps:
1. Use the `read_csv` function from Pandas.
2. Specify the file path.
3. Use parameters like `delimiter`, `header`, and `index_col` as needed.

In [None]:
import pandas as pd

# Reading data from a CSV file
csv_file = '../DataSets/Data_COVID19_Indonesia.csv'
try:
    covid_data = pd.read_csv(csv_file)
    print(covid_data.head())
except FileNotFoundError:
    print(f'Error: The file {csv_file} does not exist. Please check the file path.')
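
The parameters named in step 3 can be tried on a small in-memory sample before pointing them at a real file. The sketch below is illustrative only: the semicolon-delimited rows and the `Date` and `Location` columns are made up for the example and are not taken from the COVID-19 dataset.

```python
import io

import pandas as pd

# Hypothetical semicolon-delimited sample; not taken from the real dataset.
sample_csv = io.StringIO(
    "Date;Location;Total Cases\n"
    "2021-01-01;Region A;12000\n"
    "2021-01-01;Region B;800\n"
)

sample = pd.read_csv(
    sample_csv,
    delimiter=";",      # field separator (an alias of `sep`)
    header=0,           # the first line holds the column names
    index_col="Date",   # use the Date column as the row index
)
print(sample)
```

Passing a file path instead of the `StringIO` object works the same way.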

## Reading Data from Excel

Excel files are often used for sharing structured data. Pandas provides the `read_excel` function to work with Excel files.

We will use the IMF Investment and Capital Stock dataset as an example.

### Steps:
1. Use the `read_excel` function from Pandas.
2. Specify the sheet name using the `sheet_name` parameter.
3. Check for missing values and column types.

In [None]:
# Reading data from an Excel file
excel_file = '../DataSets/IMFInvestmentandCapitalStockDataset2021.xlsx'
try:
    imf_data = pd.read_excel(excel_file, sheet_name='Datasets')
    print(imf_data.head())
except FileNotFoundError:
    print(f'Error: The file {excel_file} does not exist. Please check the file path.')
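
Step 3 (checking missing values and column types) is not shown in the cell above. A minimal sketch of that check follows, assuming a small hand-built DataFrame as a stand-in for a loaded sheet; the column names are hypothetical and do not come from the IMF workbook.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a sheet loaded with pd.read_excel.
sheet = pd.DataFrame(
    {
        "country": ["A", "B", "C"],
        "year": [2019, 2019, 2019],
        "investment": [1.5, np.nan, 3.2],
    }
)

missing_per_column = sheet.isna().sum()  # number of missing cells in each column
column_types = sheet.dtypes              # dtype Pandas inferred for each column

print(missing_per_column)
print(column_types)
```

The same two calls can be applied to `imf_data` once it has loaded successfully.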

## Reading Data from JSON

JSON (JavaScript Object Notation) is widely used for APIs and web data. Pandas can easily handle JSON files using the `read_json` function.

We will use a mock signup dataset in JSON format as an example.

### Steps:
1. Use the `read_json` function from Pandas.
2. Specify the file path.
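3. Use the `orient` parameter to tell Pandas how the JSON is laid out (for example `records` or `index`). For deeply nested JSON, flatten it with `pd.json_normalize` instead.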

In [None]:
# Reading data from a JSON file
json_file = '../DataSets/users.JSON'
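try:
    json_data = pd.read_json(json_file)
    print(json_data.head())
except FileNotFoundError:
    # A missing file is not a ValueError, so it needs its own handler
    print(f'Error: The file {json_file} does not exist. Please check the file path.')
except ValueError:
    # Raised when the content is not valid JSON or does not match the expected layout
    print(f'Error: Unable to parse JSON file {json_file}. Check the file content.')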
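
The difference between a flat layout read with `orient` and a nested payload (the kind a web API often returns) may be easier to see side by side. This is a sketch under assumptions: the field names `id`, `name`, and `profile` are invented for illustration and do not describe the `users.JSON` file.

```python
import io

import pandas as pd

# Hypothetical payload; the field names are made up for illustration.
records_json = '[{"id": 1, "name": "Ana"}, {"id": 2, "name": "Budi"}]'

# orient="records" declares that the input is a list of row objects.
flat = pd.read_json(io.StringIO(records_json), orient="records")

# Nested objects are flattened into dotted column names by json_normalize.
nested = [
    {"id": 1, "profile": {"name": "Ana", "city": "Jakarta"}},
    {"id": 2, "profile": {"name": "Budi", "city": "Bandung"}},
]
normalized = pd.json_normalize(nested)

print(flat)
print(normalized.columns.tolist())
```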

## Reading Data from SQL

Structured Query Language (SQL) databases are widely used for storing large datasets. Pandas provides the `read_sql_query` function to query databases.

We will use a mock signup dataset stored in an SQL database as an example.

### Steps:
1. Use the `sqlite3` library to connect to the database.
2. Write a SQL query to fetch data.
3. Use the `read_sql_query` function to load the data into a Pandas DataFrame.

In [None]:
# Reading data from an SQL database
import sqlite3

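# Connect to the database
# Note: sqlite3.connect creates an empty database if the file does not exist yet
sql_file = '../DataSets/mock_signup.db'
connection = None
try:
    connection = sqlite3.connect(sql_file)
    sql_query = 'SELECT * FROM mock_signup'
    sql_data = pd.read_sql_query(sql_query, connection)
    print(sql_data.head())
except (sqlite3.Error, pd.errors.DatabaseError) as error:
    # Pandas wraps failed queries (such as a missing table) in
    # pd.errors.DatabaseError (available in pandas 1.5 and later)
    print(f'Error: Unable to read from the database {sql_file}: {error}')
finally:
    # Always release the connection, even when the query fails
    if connection is not None:
        connection.close()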
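
The three steps above can also be tried without any file on disk by using an in-memory SQLite database. The sketch below assumes a two-column schema (`id`, `country`) purely for illustration; the real `mock_signup` table may look different. It also shows the `params` argument, which fills the `?` placeholder instead of formatting values into the SQL string.

```python
import sqlite3
from contextlib import closing

import pandas as pd

# In-memory database with a hypothetical schema; the real table may differ.
with closing(sqlite3.connect(":memory:")) as conn:
    conn.execute("CREATE TABLE mock_signup (id INTEGER, country TEXT)")
    conn.executemany(
        "INSERT INTO mock_signup VALUES (?, ?)",
        [(1, "ID"), (2, "SG"), (3, "ID")],
    )
    conn.commit()

    # The ? placeholder is filled from `params` by the database driver.
    signups = pd.read_sql_query(
        "SELECT * FROM mock_signup WHERE country = ? ORDER BY id",
        conn,
        params=("ID",),
    )

print(signups)
```

`closing` makes sure the connection is released when the block ends, whether or not the query succeeds.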

## Exporting Data to CSV, Excel, and Other Formats

Pandas makes it easy to export data to different formats using methods like `to_csv` and `to_excel`.

### Steps:
1. Use the appropriate method based on the desired format.
2. Specify the file path and optional parameters like `index`.

In [None]:
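from pathlib import Path

# Create the output folder first; to_csv and to_excel do not create missing folders
Path('../tmp').mkdir(parents=True, exist_ok=True)

# Exporting data to a CSV file (index=False omits the row index column)
output_csv_file = '../tmp/exported_covid_data.csv'
covid_data.to_csv(output_csv_file, index=False)
print(f'Data exported to {output_csv_file}')

# Exporting data to an Excel file (needs an Excel engine such as openpyxl)
output_excel_file = '../tmp/exported_imf_data.xlsx'
imf_data.to_excel(output_excel_file, index=False)
print(f'Data exported to {output_excel_file}')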
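
One way to confirm that an export worked is to read the file back and compare it with the original. The following sketch does this in a temporary directory with a small hypothetical DataFrame, and uses `to_json` as an example of the "other formats" mentioned in the heading.

```python
import tempfile
from pathlib import Path

import pandas as pd

df = pd.DataFrame({"region": ["A", "B"], "cases": [120, 45]})  # hypothetical data

with tempfile.TemporaryDirectory() as tmp_dir:
    out_dir = Path(tmp_dir) / "exports"
    out_dir.mkdir(parents=True, exist_ok=True)  # the folder must exist before writing

    csv_path = out_dir / "data.csv"
    json_path = out_dir / "data.json"

    df.to_csv(csv_path, index=False)          # omit the row index column
    df.to_json(json_path, orient="records")   # one JSON object per row

    # Read both files back while the temporary directory still exists
    csv_round_trip = pd.read_csv(csv_path)
    json_round_trip = pd.read_json(json_path, orient="records")

print(csv_round_trip)
print(json_round_trip)
```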

## Handling File Paths and Common Errors

When working with files, common errors include:

- **FileNotFoundError**: Ensure the file path is correct. Use absolute paths if needed.
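- **ImportError** and **pandas.errors.ParserError**: Verify that the file format matches the reader function you call. A missing optional dependency (such as `openpyxl` for `.xlsx` files) raises `ImportError`, and a malformed CSV raises `pandas.errors.ParserError`.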
- **PermissionError**: Check if you have write permissions for the directory when exporting files.
- **ValueError**: Ensure the data format matches the expected input for functions.
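
Several of these errors can be handled in one place. The helper below is a sketch, not a recommended standard: `load_csv_safely` is a hypothetical name, and returning `None` on failure is just one possible design. It also catches `pandas.errors.EmptyDataError`, which Pandas raises for a file with no content.

```python
import tempfile
from pathlib import Path
from typing import Optional

import pandas as pd


def load_csv_safely(path) -> Optional[pd.DataFrame]:
    """Return a DataFrame, or None if the file cannot be read (hypothetical helper)."""
    path = Path(path)
    try:
        return pd.read_csv(path)
    except FileNotFoundError:
        # resolve() prints the absolute path, which helps spot a wrong working directory
        print(f"Not found: {path.resolve()}")
    except PermissionError:
        print(f"No permission to read: {path}")
    except pd.errors.EmptyDataError:
        print(f"No data in: {path}")
    except pd.errors.ParserError as exc:
        print(f"Could not parse {path}: {exc}")
    return None


with tempfile.TemporaryDirectory() as tmp_dir:
    empty_file = Path(tmp_dir) / "empty.csv"
    empty_file.touch()                      # zero-byte file
    good_file = Path(tmp_dir) / "good.csv"
    good_file.write_text("a,b\n1,2\n")

    missing_result = load_csv_safely(Path(tmp_dir) / "missing.csv")
    empty_result = load_csv_safely(empty_file)
    good_result = load_csv_safely(good_file)

print(good_result)
```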

## Practical: Reading and Saving Datasets

Let’s combine what we have covered: read the COVID-19 dataset, filter it, and export the result.

### Steps:
1. Read the CSV file using `read_csv`.
2. Filter rows where the `Total Cases` column exceeds 10,000.
3. Export the filtered data to a new CSV file.

In [None]:
# Practical example
import os

# Reuse covid_data, loaded with read_csv in the CSV section above.
# Keep only the rows whose 'Total Cases' value exceeds 10,000
filtered_data = covid_data[covid_data['Total Cases'] > 10000]

# Export the filtered data to a new CSV file
filtered_file = '../tmp/filtered_covid_data.csv'
os.makedirs(os.path.dirname(filtered_file), exist_ok=True)  # create the folder if missing
filtered_data.to_csv(filtered_file, index=False)
print(f'Filtered data exported successfully to {filtered_file}!')
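
If the column name is misspelled or absent, the filter above fails with a bare `KeyError`. A small wrapper can make that failure easier to diagnose. This is a sketch only: `filter_by_threshold` is a hypothetical helper, and the demo rows (including the `Location` column) are invented rather than taken from the dataset.

```python
import pandas as pd


def filter_by_threshold(df: pd.DataFrame, column: str, threshold: float) -> pd.DataFrame:
    """Return rows where `column` exceeds `threshold` (hypothetical helper)."""
    if column not in df.columns:
        # List the available columns so a typo is easy to spot
        raise KeyError(f"Column {column!r} not found. Available: {list(df.columns)}")
    return df[df[column] > threshold].reset_index(drop=True)


# Hypothetical rows; only the 'Total Cases' column name comes from the notebook.
demo = pd.DataFrame(
    {"Location": ["A", "B", "C"], "Total Cases": [15000, 9000, 22000]}
)
high_case_rows = filter_by_threshold(demo, "Total Cases", 10000)
print(high_case_rows)
```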