## Using read_csv and read_table to:
* parse csv files
* parse pipe delimited files
* select specific fields
* specify input types for specific fields

In [None]:
import numpy as np
import pandas as pd

With its default settings, `read_csv` needs only the path of the file to read.

In [None]:
# pd.read_csv('KPHX.csv')
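To see what the defaults do without the data file at hand, here is a minimal sketch. The two-row sample and its column names are made up for illustration and are not the actual KPHX data. It reads from an in-memory buffer instead of a path, which `read_csv` also accepts.

```python
import io

import pandas as pd

# Hypothetical sample standing in for a small comma-separated file
sample_csv = "date,actual_mean_temp\n2014-07-01,95\n2014-07-02,96\n"

# With defaults: comma separator, first line used as the header,
# and column types inferred from the values
df_default = pd.read_csv(io.StringIO(sample_csv))

print(df_default.dtypes)
```

With the defaults the `date` column is not parsed as a datetime; it stays as text unless `parse_dates` is passed.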

### Format the input data

```
awk '{ printf("%s|%s|%s|reserved1|reserved2|reserved\n", NR <= 1 ? "id" : NR - 1 + "", NR <= 1 ? "city" : "PHX", $0); }' KPHX.dat >KPHX_with_reserved.dat
```
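For readers who prefer Python to awk, the same transformation can be sketched roughly as follows. The function name `add_id_city_reserved` and the sample lines are hypothetical; the logic mirrors the awk command: the header line gets `id` and `city` prepended, every data line gets a running id and `PHX`, and the three trailing fields are appended to each line.

```python
def add_id_city_reserved(lines, city="PHX"):
    """Yield each input line with id/city prepended and reserved fields appended."""
    for line_number, line in enumerate(lines, start=1):
        record = line.rstrip("\n")
        if line_number == 1:
            # Header line: column names instead of values
            prefix = "id|city"
        else:
            # Data lines: ids start at 1 on the first data row
            prefix = f"{line_number - 1}|{city}"
        yield f"{prefix}|{record}|reserved1|reserved2|reserved"


# Hypothetical two-line input, assumed to be pipe-delimited already
sample_lines = ["date|actual_mean_temp\n", "2014-07-01|95\n"]
formatted = list(add_id_city_reserved(sample_lines))
for row in formatted:
    print(row)
```

To process the real file, the function could be fed an open file handle in place of `sample_lines`.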

`read_table` accepts many options. Here I specify the field separator, the columns to read into the DataFrame, and the types of selected columns.

_See [read_table](https://pandas.pydata.org/docs/reference/api/pandas.read_table.html)_

In [None]:
df_with_reserved = pd.read_table('../data/KPHX_with_reserved.dat', sep='|', header=0,
    usecols=[
        'id','city','date','actual_mean_temp','actual_min_temp','actual_max_temp','actual_precipitation','average_precipitation','record_precipitation','reserved2'
        ],
    dtype = {
        'id': 'int', 'city': 'string',
        #'date': '?', # let the parser handle the conversion - see https://stackoverflow.com/questions/21269399/datetime-dtypes-in-pandas-read-csv
        'actual_mean_temp': 'int8', 'actual_min_temp': 'int8','actual_max_temp': 'int8',
        # 'actual_precipitation','average_precipitation','record_precipitation',
        'reserved2': 'string'
    },
    # Parse the date column as datetime. infer_datetime_format and date_parser
    # are deprecated since pandas 2.0, where parse_dates alone is sufficient.
    parse_dates = ['date']
    )

In [None]:
df_with_reserved

In [None]:
df_with_reserved.dtypes
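The same combination of options can be tried without the data file. This is a minimal sketch, assuming a made-up two-row sample in the same pipe-delimited layout; the values are invented and only a subset of the columns is included. It passes only `parse_dates` for the date column, since `infer_datetime_format` and `date_parser` are deprecated as of pandas 2.0.

```python
import io

import pandas as pd

# Hypothetical sample in the same layout as KPHX_with_reserved.dat
sample = (
    "id|city|date|actual_mean_temp|actual_min_temp|actual_max_temp|reserved1|reserved2|reserved\n"
    "1|PHX|2014-07-01|95|83|107|reserved1|reserved2|reserved\n"
    "2|PHX|2014-07-02|96|84|108|reserved1|reserved2|reserved\n"
)

df_sample = pd.read_table(
    io.StringIO(sample),
    sep="|",
    header=0,
    # Only these columns are loaded; the others are skipped while parsing
    usecols=["id", "city", "date", "actual_mean_temp", "reserved2"],
    # Explicit types for selected columns; the rest would be inferred
    dtype={"id": "int", "city": "string", "actual_mean_temp": "int8", "reserved2": "string"},
    # Dates are requested through parse_dates rather than dtype
    parse_dates=["date"],
)

print(df_sample.dtypes)
```

Columns come back in file order regardless of the order given in `usecols`, and `int8` is only safe here because the temperatures fit in the range -128 to 127.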