# Read CSV Files

The first step in most data science projects is importing your data, which often arrives as a comma-separated values (CSV) file. Use this template to import CSV files efficiently and reduce data cleaning tasks later in your notebook. For example, you can select columns, handle null values, and parse dates all in a single function call!

Begin by uploading your CSVs to this workspace!

In [1]:
import pandas as pd

df = pd.read_csv(
    "data/sheffield_weather_station.csv",  # Replace with your CSV file path
    # The following arguments are optional and can be removed:
    # If columns aren't separated by commas, indicate the delimiter here
    # (in this case, a regular expression matching one or more whitespace characters)
    sep=r"\s+",
    # Indicate which zero-indexed row number(s) have the column names
    header=0,
    # List of column names to use (with header=0, these replace the names in the file's header row)
    names=["year", "month", "max_c", "min_c", "af", "rain", "sun"],
    # If not all columns are needed, indicate which you need (useful for lower memory usage)
    usecols=["year", "month", "max_c", "min_c", "rain", "sun"],
    # Indicate which column(s) to use as row labels
    index_col=["year", "month"],
    # Lines starting with this string should be ignored (useful if there are file comments)
    comment="#",
    # Indicate the number of lines to skip at the start of the file (also useful for file comments)
    skiprows=None,
    # Indicate string(s) that should be recognized as NaN/NA
    na_values=["---", "unknown", "no info"],
    # Indicate which column(s) are date column(s), e.g. a list of column names (False disables date parsing)
    parse_dates=False,
    # Indicate number of rows to read (useful for large files)
    nrows=500,
    # Encoding to use when reading file
    encoding="utf-8",
)

df.head(10)  # Preview the first 10 rows

| year | month | max_c | min_c | rain | sun |
|------|-------|-------|-------|-------|-----|
| 1883 | 1 | 6.3 | 1.7 | 122.1 | NaN |
| 1883 | 2 | 8.0 | 2.8 | 69.8 | NaN |
| 1883 | 3 | 4.8 | -1.6 | 29.6 | NaN |
| 1883 | 4 | 12.2 | 3.8 | 74.0 | NaN |
| 1883 | 5 | 14.7 | 6.2 | 31.2 | NaN |
| 1883 | 6 | 17.7 | 9.3 | 66.2 | NaN |
| 1883 | 7 | 18.8 | 10.5 | 77.6 | NaN |
| 1883 | 8 | 19.8 | 10.9 | 32.5 | NaN |
| 1883 | 9 | 16.8 | 10.0 | 137.4 | NaN |
| 1883 | 10 | 12.7 | 6.4 | 102.9 | NaN |
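
If you want to try the main arguments without uploading a file, the sketch below reads a small in-memory sample instead. The raw header names (`yyyy`, `mm`, and so on) and the `af` values are invented for illustration, and `---` is assumed to be the marker for missing `sun` readings; the remaining numbers come from the preview above. Because the template leaves `parse_dates=False`, the sketch also shows one possible way to build a date index from the separate `year` and `month` columns after reading.

```python
import io

import pandas as pd

# Hypothetical sample: three rows from the preview, with invented header names
# and "af" values, and "---" standing in for missing "sun" readings.
raw = io.StringIO(
    "yyyy mm tmax tmin af rain sun\n"
    "1883 1 6.3 1.7 6 122.1 ---\n"
    "1883 2 8.0 2.8 2 69.8 ---\n"
    "1883 3 4.8 -1.6 23 29.6 ---\n"
)

sample = pd.read_csv(
    raw,
    sep=r"\s+",  # columns are separated by whitespace
    header=0,  # the first row holds the original column names...
    names=["year", "month", "max_c", "min_c", "af", "rain", "sun"],  # ...which these replace
    usecols=["year", "month", "max_c", "min_c", "rain", "sun"],  # drop "af"
    na_values=["---"],  # treat "---" as missing
)

# Combine year and month into one datetime column (day fixed to the 1st)
sample["date"] = pd.to_datetime(sample[["year", "month"]].assign(day=1))
sample = sample.set_index("date").drop(columns=["year", "month"])

print(sample)
```

Building the date after the read keeps the `read_csv()` call simple; whether a date index or a `(year, month)` index is more convenient depends on the analysis you have in mind.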


In [36]:
# Start analyzing your DataFrame!
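
As a starting point, here is a minimal sketch of querying a DataFrame indexed by `year` and `month`, as the template produces. It rebuilds the first three preview rows by hand so that it runs on its own; `df_demo`, `march`, and `yearly` are illustrative names, not part of the template.

```python
import pandas as pd

# First three rows of the preview, rebuilt by hand so the sketch is self-contained
df_demo = pd.DataFrame(
    {
        "year": [1883, 1883, 1883],
        "month": [1, 2, 3],
        "max_c": [6.3, 8.0, 4.8],
        "min_c": [1.7, 2.8, -1.6],
        "rain": [122.1, 69.8, 29.6],
    }
).set_index(["year", "month"])

# Select a single month by passing a (year, month) tuple to .loc
march = df_demo.loc[(1883, 3)]

# Aggregate per year by grouping on the "year" index level
yearly = df_demo.groupby(level="year").agg(
    mean_max_c=("max_c", "mean"),
    total_rain=("rain", "sum"),
)

print(march)
print(yearly)
```

The same pattern should carry over to the full `df` from the first cell, since it uses the same two-level index.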

For more information on arguments, visit pandas' [`read_csv()`](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html) documentation.