# Reading Data using Pandas

Pandas is a Python library for data manipulation and analysis. One of its core features is reading data from file formats such as CSV, Excel, and JSON into DataFrame objects.

```{admonition} **Syntax:**

import pandas as pd

pd.read_csv(filepath_or_buffer, sep=',', delimiter=None, header='infer', names=None, ...)
```

```{admonition} **Description:**
:class: hint
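`pd.read_csv()` reads a comma-separated values (CSV) file into a DataFrame.

Key arguments include `filepath_or_buffer` (a file path, URL, or file-like object), `sep` (the character that separates fields; `delimiter` is an alias for it), `header` (the row number or numbers to use as column names), and `names` (a list of column names to use).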
```
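As a minimal sketch of how these arguments interact, the following reads small in-memory CSV strings through `io.StringIO` instead of a file on disk. The sample text and all variable names are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data: semicolon-separated, with a header row
raw_with_header = "city;temp\nOslo;4.5\nCairo;28.1\n"

# sep names the field separator; header=0 uses the first line as column names
df_a = pd.read_csv(io.StringIO(raw_with_header), sep=";", header=0)

# Hypothetical sample data without a header row
raw_no_header = "Oslo;4.5\nCairo;28.1\n"

# header=None plus names supplies the column names explicitly
df_b = pd.read_csv(
    io.StringIO(raw_no_header), sep=";", header=None, names=["city", "temp"]
)

# names combined with header=0 replaces the header found in the data
df_c = pd.read_csv(
    io.StringIO(raw_with_header), sep=";", header=0, names=["place", "celsius"]
)

print(df_a)
print(df_b)
print(df_c)
```

All three DataFrames hold the same two rows; only the source of the column names differs.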

## Example: Reading the Gapminder Dataset

Let's use `pd.read_csv()` to read the Gapminder dataset into a DataFrame.

### Gapminder Dataset

The Gapminder dataset contains historical data about various countries around the world. It includes key indicators such as life expectancy, population, and GDP per capita. This dataset is often used to study and visualize global development trends over time.

#### Columns:
- `country`: The name of the country.
- `continent`: The continent to which the country belongs.
- `year`: The year of the observation.
- `lifeExp`: Life expectancy at birth in years.
- `pop`: Total population.
- `gdpPercap`: GDP per capita (inflation-adjusted).

The data spans from 1952 to 2007 in five-year intervals and provides valuable insights into the socio-economic progress of countries globally.

import pandas as pd

# URL to the raw CSV file on GitHub
url = 'https://raw.githubusercontent.com/kirenz/datasets/master/gapminder.csv'

# Read the CSV file into a DataFrame
df = pd.read_csv(url)

# Display the first few rows of the DataFrame
print(df.head())

       country continent  year  lifeExp       pop   gdpPercap
0  Afghanistan      Asia  1952   28.801   8425333  779.445314
1  Afghanistan      Asia  1957   30.332   9240934  820.853030
2  Afghanistan      Asia  1962   31.997  10267083  853.100710
3  Afghanistan      Asia  1967   34.020  11537966  836.197138
4  Afghanistan      Asia  1972   36.088  13079460  739.981106
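
Beyond `head()`, it often helps to check the shape and the inferred column types right after loading. The sketch below avoids the network by re-typing the five rows printed above as an in-memory CSV string; the names `sample_csv` and `sample` are illustrative, and the same calls should apply to the full `df`.

```python
import io

import pandas as pd

# The five rows shown in the output above, re-typed as CSV text
sample_csv = """country,continent,year,lifeExp,pop,gdpPercap
Afghanistan,Asia,1952,28.801,8425333,779.445314
Afghanistan,Asia,1957,30.332,9240934,820.853030
Afghanistan,Asia,1962,31.997,10267083,853.100710
Afghanistan,Asia,1967,34.020,11537966,836.197138
Afghanistan,Asia,1972,36.088,13079460,739.981106
"""

sample = pd.read_csv(io.StringIO(sample_csv))

# Number of rows and columns
print(sample.shape)  # (5, 6)

# Column types inferred by read_csv
print(sample.dtypes)

# Use bracket access for this column: sample.pop is a DataFrame method
print(sample["pop"].max())  # 13079460
```

Note that `pop` collides with the `DataFrame.pop` method, so the column is only reachable as `sample["pop"]`, not as `sample.pop`.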


## Example: Using `pd.read_csv()` with Different Arguments

import pandas as pd

# URL to the raw CSV file on GitHub
url = 'https://raw.githubusercontent.com/kirenz/datasets/master/gapminder.csv'

# Using sep to state the field separator explicitly (a comma is the default)
df_sep = pd.read_csv(url, sep=',')

# delimiter is an alias for sep, so this call is equivalent to the one above
df_delimiter = pd.read_csv(url, delimiter=',')

# Using header to specify the row number(s) to use as the header.
# header=0 takes the first row as column names, which is also what the
# default header='infer' does when names is not given
df_header = pd.read_csv(url, header=0)

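# Using names to specify column names explicitly.
# header=0 is needed here: the file already has a header row, and without it
# that row would be read as the first row of data
column_names = ['country', 'continent', 'year', 'lifeExp', 'pop', 'gdpPercap']
df_names = pd.read_csv(url, header=0, names=column_names)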

# Displaying the first few rows of each DataFrame for verification
print("DataFrame using sep argument:")
print(df_sep.head())

print("\nDataFrame using delimiter argument:")
print(df_delimiter.head())

print("\nDataFrame using header argument:")
print(df_header.head())

print("\nDataFrame using names argument:")
print(df_names.head())

DataFrame using sep argument:
       country continent  year  lifeExp       pop   gdpPercap
0  Afghanistan      Asia  1952   28.801   8425333  779.445314
1  Afghanistan      Asia  1957   30.332   9240934  820.853030
2  Afghanistan      Asia  1962   31.997  10267083  853.100710
3  Afghanistan      Asia  1967   34.020  11537966  836.197138
4  Afghanistan      Asia  1972   36.088  13079460  739.981106

DataFrame using delimiter argument:
       country continent  year  lifeExp       pop   gdpPercap
0  Afghanistan      Asia  1952   28.801   8425333  779.445314
1  Afghanistan      Asia  1957   30.332   9240934  820.853030
2  Afghanistan      Asia  1962   31.997  10267083  853.100710
3  Afghanistan      Asia  1967   34.020  11537966  836.197138
4  Afghanistan      Asia  1972   36.088  13079460  739.981106

DataFrame using header argument:
       country continent  year  lifeExp       pop   gdpPercap
0  Afghanistan      Asia  1952   28.801   8425333  779.445314
1  Afghanistan      Asia  1957 