# Reading data from a file

## Open an existing file

The file is opened in read-only mode ("r"), i.e. no modifications to the file can be made.


The file below contains the temperature (in degrees Celsius), the relative humidity (in %) and the pressure (in mbar) measured by a Bosch BME680 sensor, together with the date and time at which each measurement was read out.

The columns of the file are:

| date | time | temperature | humidity | pressure |
| --- | --- | --- | --- | --- |


In [None]:
file = open("/nfs/dust/cms/user/walsh/analysis/data/rasp6/bme680.dat", "r") 

## Read all contents of the file

The method "read()" returns the entire contents of the file as a single string.

In [None]:
file.read()

### Moving back to the beginning of the file

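Reading from a file moves the file pointer to the position just after the last byte read.
The command above therefore left the pointer at the end of the file, so a second call to "read()" would return an empty string.
To move back to the beginning of the file, use the method "seek()". The first argument is the offset and the second argument, 0, means that the offset is counted from the start of the file.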

In [None]:
file.seek(0,0)
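
The pointer behaviour can be sketched without the sensor file. The example below uses an in-memory `io.StringIO` object as a stand-in for an open text file; the two lines of text are invented sample values, not taken from the real data.

```python
import io

# In-memory stand-in for an open text file (contents are invented sample values)
sample = io.StringIO(
    "2020-01-01 00:00:00 21.5 40.2 1013.1\n"
    "2020-01-01 00:01:00 21.6 40.1 1013.0\n"
)

first_read = sample.read()        # reads everything; the pointer is now at the end
second_read = sample.read()       # nothing left to read: returns an empty string
position_at_end = sample.tell()   # tell() reports the current pointer position

sample.seek(0, 0)                 # offset 0, counted from the start (whence=0)
position_after_seek = sample.tell()
third_read = sample.read()        # the full contents are available again
```

The method `tell()` is a convenient way to check where the pointer currently is before and after a call to `seek()`.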

## Read one line at a time

The method "readline()" reads one line of the file, starting at the current position of the pointer, and moves the pointer to the beginning of the next line. Calling it repeatedly therefore returns the lines of the file one after the other.


In [None]:
file.readline()

In [None]:
file.readline()

In [None]:
file.readline()

Notice the newline character "\n" at the end of each line. For the purposes of data analysis this is unwanted.
When reading line by line, one possibility is to remove it with the string method "rstrip()", which strips trailing whitespace, including the newline.


In [None]:
file.readline().rstrip()
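
As a small illustration of what "rstrip()" removes, the sketch below applies it to a hand-written string (an invented sample line with two trailing spaces, not a line from the real file). Without an argument, all trailing whitespace is removed; with the argument `"\n"`, only trailing newline characters are removed.

```python
# Invented sample line with two trailing spaces before the newline
raw_line = "2020-01-01 00:00:00 21.5 40.2 1013.1  \n"

# No argument: strips all trailing whitespace (spaces, tabs, newlines)
stripped_all = raw_line.rstrip()

# With "\n": strips only trailing newline characters, the spaces remain
stripped_newline = raw_line.rstrip("\n")

print(repr(stripped_all))
print(repr(stripped_newline))
```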

## Read all lines into a list

Each line will be an entry of the list that we are calling "data"

In [None]:
file.seek(0,0)
data = file.readlines()
data

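Here again, each entry of the list ends with a newline character "\n". For the purposes of data analysis this is unwanted.

Using the method "read()" combined with the string method "splitlines()" solves this problem: the whole file is read as one string, which is then split at the line boundaries, and the newline characters are dropped.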

In [None]:
file.seek(0,0)
data = file.read().splitlines()
data
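
The difference between the two approaches can be seen side by side in the following sketch. It again uses an `io.StringIO` object with invented contents in place of the sensor file, and also shows a third option: iterating over the file object directly and stripping each line.

```python
import io

# In-memory stand-in for the open file (invented contents)
sample = io.StringIO("line one\nline two\n")

# readlines() keeps the newline at the end of every entry
with_newlines = sample.readlines()

# read().splitlines() drops the newlines
sample.seek(0, 0)
without_newlines = sample.read().splitlines()

# Iterating over the file object yields one line at a time
sample.seek(0, 0)
stripped = [line.rstrip("\n") for line in sample]

print(with_newlines)
print(without_newlines)
print(stripped)
```

Iterating over the file object does not load the whole file into one string first, which can matter for large files.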

## Close the file when it is no longer needed

In [None]:
file.close()
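
As an alternative to calling "close()" explicitly, a file can be opened in a `with` block, which closes it automatically when the block ends, even if an error occurs inside it. The sketch below writes a small temporary file first so that it runs anywhere; the file name and its single line of content are made up for the example.

```python
import os
import tempfile

# Create a small temporary file so the example is self-contained
# (the content is an invented sample line)
with tempfile.NamedTemporaryFile("w", suffix=".dat", delete=False) as tmp:
    tmp.write("2020-01-01 00:00:00 21.5 40.2 1013.1\n")
    path = tmp.name

# The file is closed automatically at the end of the with block
with open(path, "r") as f:
    lines = f.read().splitlines()

closed_after_with = f.closed   # True: no explicit close() was needed
print(lines, closed_after_with)

os.remove(path)                # clean up the temporary file
```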

### Split measurements

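For data analysis it is important to be able to access the individual measurements taken at a certain time. Although the file is now closed, its contents are still available in the list "data". Each line can be turned into a list of fields with the string method "split()", which by default splits at whitespace. Note that every field is still a string at this point.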

In [None]:
for line in data:
    fields = line.split()
    print(fields)

In [None]:
for line in data:
    fields = line.split()
    date = fields[0]
    time = fields[1]
    temperature = fields[2]
    humidity = fields[3]
    pressure = fields[4]

    print(time,date,temperature)
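
Since "split()" returns strings, the values usually have to be converted before any calculation. The following is a minimal sketch of such a conversion, not a description of the actual file: the two sample lines are invented, and the date and time formats (`YYYY-MM-DD` and `HH:MM:SS`) are assumptions that have to be adapted to the real data.

```python
from datetime import datetime

# Invented sample lines; the real file may use different date/time formats
data = [
    "2020-01-01 00:00:00 21.5 40.2 1013.1",
    "2020-01-01 00:01:00 22.5 40.0 1013.0",
]

records = []
for line in data:
    # Unpack the five columns: date, time, temperature, humidity, pressure
    date, time, temperature, humidity, pressure = line.split()
    records.append({
        # Assumed format, adjust the pattern to match the real file
        "timestamp": datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S"),
        "temperature": float(temperature),   # degrees Celsius
        "humidity": float(humidity),         # %
        "pressure": float(pressure),         # mbar
    })

# With numeric values, simple statistics become possible
mean_temperature = sum(r["temperature"] for r in records) / len(records)
print(mean_temperature)
```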