In [None]:
import pandas
print('pandas',pandas.__version__)

In [None]:
!head RollingSystemDemand_20180901_0129.csv

The "VD" entry is being used as the index because the header row has only two columns while the data rows have three.
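A minimal sketch of this behaviour, using a small in-memory CSV instead of the real file. The sample values and the header text are made up for illustration; the only thing assumed about the real file is a header line with fewer fields than the data rows.

```python
import io
import pandas

# Hypothetical sample: a two-field header line above three-field data rows.
sample = (
    "HDR,ROLLING SYSTEM DEMAND\n"
    "VD,20180901000000,23000\n"
    "VD,20180901000500,23100\n"
)

# With default arguments, each data row has one more field than the header,
# so read_csv uses the first field of every row as the index.
implicit = pandas.read_csv(io.StringIO(sample))
print(implicit.index.tolist())    # row labels taken from the first field
print(implicit.columns.tolist())  # column names taken from the header line
```

The two header names end up labelling the timestamp and value columns, which is why the frame looks shifted.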

Tell Pandas not to use the first column as the index by passing `index_col=False`

In [None]:
dframe = pandas.read_csv("RollingSystemDemand_20180901_0129.csv",
                         index_col=False)
dframe.head()

Hmm, that's not quite what I intended.

Rather than letting Pandas guess what's going on, tell it to skip the first row

`skiprows` : Line numbers to skip (0-indexed) or number of lines to skip (int) at the start of the file.

In [None]:
dframe = pandas.read_csv("RollingSystemDemand_20180901_0129.csv",
                         index_col=False,
                         skiprows=1)
dframe.head()

And now Pandas treats the first remaining row as the header. Let's disable that with `header=None`.

In [None]:
dframe = pandas.read_csv("RollingSystemDemand_20180901_0129.csv",
                         index_col=False,
                         skiprows=1, 
                         header=None)
dframe.head()
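The effect of `header=None` can be seen side by side on a tiny in-memory sample. This is only a sketch: the sample text is hypothetical and stands in for the real file's layout of one title line followed by data rows.

```python
import io
import pandas

# Hypothetical sample: one title line, then three-field data rows.
sample = (
    "HDR,ROLLING SYSTEM DEMAND\n"
    "VD,20180901000000,23000\n"
    "VD,20180901000500,23100\n"
)

# skiprows=1 alone: the first data row is consumed as the header.
promoted = pandas.read_csv(io.StringIO(sample), index_col=False, skiprows=1)

# Adding header=None keeps every data row and numbers the columns 0, 1, 2.
plain = pandas.read_csv(io.StringIO(sample), index_col=False,
                        skiprows=1, header=None)

print(promoted.columns.tolist(), len(promoted))
print(plain.columns.tolist(), len(plain))
```

Without `header=None`, one measurement is silently lost to the column labels.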

Now we can set the column labels

In [None]:
dframe.columns=['VD','time of measurement','value']
dframe.head()
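As an alternative sketch, the labels can be supplied at read time through the `names` parameter of `read_csv`, which avoids the separate assignment. The sample data below is hypothetical; the labels are the ones used in this notebook.

```python
import io
import pandas

# Hypothetical sample: one title line, then three-field data rows.
sample = (
    "HDR,ROLLING SYSTEM DEMAND\n"
    "VD,20180901000000,23000\n"
    "VD,20180901000500,23100\n"
)

labels = ['VD', 'time of measurement', 'value']

# header=None says the file has no usable header row; names supplies the labels.
named = pandas.read_csv(io.StringIO(sample), index_col=False,
                        skiprows=1, header=None, names=labels)
print(named.head())
```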

Let's check on the status of the data types in each column

In [None]:
dframe.dtypes

_Lesson_: abstraction frameworks are convenient when the assumptions they make are correct. 

(A different kind of confusion might have arisen had we read the file manually instead of using `read_csv`.)

Change the type of the "time of measurement" column from "int" to string so that we can then convert it to datetime

In [None]:
# https://stackoverflow.com/questions/17950374/converting-a-column-within-pandas-dataframe-from-int-to-string
dframe['time of measurement'] = dframe['time of measurement'].astype(str)
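On clean values, the int-to-string step and the parse it prepares for can be sketched as follows. The two integers are hypothetical timestamps in the `%Y%m%d%H%M%S` layout, not values read from the real file.

```python
import pandas

# Hypothetical timestamps stored as integers, as read_csv would infer them.
stamps = pandas.Series([20180901000000, 20180901000500])

# Each integer becomes its decimal text, e.g. '20180901000000'.
as_text = stamps.astype(str)

# The text can then be parsed with an explicit format string.
parsed = pandas.to_datetime(as_text, format='%Y%m%d%H%M%S')
print(parsed)
```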

Check the data type of the columns

In [None]:
dframe.dtypes

In [None]:
dframe.head()

Now we can apply the conversion of the time column from string to datetime

In [None]:
pandas.to_datetime(dframe['time of measurement'],format='%Y%m%d%H%M%S')

Still getting errors! At least one value in the column does not match the format string.

A similar issue was solved here:
https://www.kaggle.com/najagumbi/data-cleaning-challenge-parsing-dates-v2

In [None]:
# errors="coerce" turns values that cannot be parsed into NaT instead of raising
pandas.to_datetime(dframe['time of measurement'],format='%Y%m%d%H%M%S',errors="coerce")
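A small sketch of what `errors="coerce"` does, and how to find the offending entries afterwards. The series is hypothetical: two valid timestamps plus a short value (`'288'`) standing in for whatever fails to parse in the real column.

```python
import pandas

# Hypothetical column: two valid timestamps and one value that cannot match.
raw = pandas.Series(['20180901000000', '20180901000500', '288'])

# Strict parsing raises as soon as a value does not fit the format.
try:
    pandas.to_datetime(raw, format='%Y%m%d%H%M%S')
    strict_failed = False
except ValueError:
    strict_failed = True

# With errors='coerce' the bad value becomes NaT and the rest are parsed.
parsed = pandas.to_datetime(raw, format='%Y%m%d%H%M%S', errors='coerce')

# Select the original strings that ended up as NaT.
bad_rows = raw[parsed.isna()]
print(strict_failed)
print(bad_rows)
```

Filtering on `isna()` shows which rows were coerced, which is more targeted than scanning the whole output by eye.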

To confirm where the unparseable value comes from, inspect the bottom of the CSV

In [None]:
!tail RollingSystemDemand_20180901_0129.csv
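If the last line of the file turns out to be a trailer rather than a measurement, one option is to drop it while reading. The sketch below assumes such a trailer exists; the `FTR,2` line and all sample values are hypothetical. `skipfooter` is a `read_csv` parameter that requires `engine='python'`.

```python
import io
import pandas

# Hypothetical layout: title line, data rows, then a two-field trailer line.
sample = (
    "HDR,ROLLING SYSTEM DEMAND\n"
    "VD,20180901000000,23000\n"
    "VD,20180901000500,23100\n"
    "FTR,2"
)

trimmed = pandas.read_csv(
    io.StringIO(sample),
    index_col=False,
    skiprows=1,          # drop the title line
    skipfooter=1,        # drop the last line; needs the Python parser engine
    engine='python',
    header=None,
    names=['VD', 'time of measurement', 'value'],
)
print(trimmed)
```

With the trailer removed at read time, every remaining value in the time column should fit the timestamp format.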