# Opening files and reading data

## Opening and looping over a text file

### Opening a file - straightforward way

In [38]:
# Open a text file and save into variable f 
f = open('This is just to say.txt')
print(type(f))
print(f)

<class '_io.TextIOWrapper'>
<_io.TextIOWrapper name='This is just to say.txt' mode='r' encoding='cp1252'>


In [39]:
f_str = f.read()
f_str

'I have eaten\nthe plums\nthat were in\nthe icebox\n\nand which\nyou were probably\nsaving\nfor breakfast\n\nForgive me\nthey were delicious\nso sweet\nand so cold'

In [40]:
type(f_str)

str

#### Closing a file - IMPORTANT!
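When opening a file using `open()` it's __important__ to remember to close it. An unclosed file keeps holding an operating-system resource, data written to it may not be flushed to disk, and on Windows other programs may be unable to modify or delete the file.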

In [41]:
f.close()

### Opening a file - _pythonic_ way
In Python, it's best practice to open files with the `with` statement. The file object returned by `open()` is a context manager: `with` closes the file automatically when the block ends, even if an exception is raised inside it.

For more info see here: https://realpython.com/python-with-statement/
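
Roughly speaking, a `with` block behaves like a `try`/`finally` that always closes the file. The sketch below compares the two forms. It writes a small temporary file first so that it runs anywhere; `demo.txt` and the variable names are made up for this illustration.

```python
import pathlib
import tempfile

# Create a small throwaway file so the example is self-contained
tmp_dir = pathlib.Path(tempfile.mkdtemp())
demo_path = tmp_dir / 'demo.txt'   # hypothetical file name
demo_path.write_text('so sweet\nand so cold', encoding='utf-8')

# Manual version: close the file even if reading fails
f_manual = open(demo_path, encoding='utf-8')
try:
    text_manual = f_manual.read()
finally:
    f_manual.close()

# Same idea using with: the file is closed when the block ends
with open(demo_path, encoding='utf-8') as f_with:
    text_with = f_with.read()

print(f_manual.closed, f_with.closed)   # True True
print(text_manual == text_with)         # True
```

Passing `encoding='utf-8'` explicitly is a choice made for this sketch; without it, `open()` uses a platform-dependent default (the outputs above show `cp1252` on this Windows machine).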

In [42]:
with open('This is just to say.txt') as f:
    f_str = f.read()

In [43]:
f_str

'I have eaten\nthe plums\nthat were in\nthe icebox\n\nand which\nyou were probably\nsaving\nfor breakfast\n\nForgive me\nthey were delicious\nso sweet\nand so cold'

In [44]:
# f.close()    # No longer needed: the with block has already closed the file

In [45]:
# Now we can continue working with the data as a regular Python string
lines = f_str.split('\n')
lines

['I have eaten',
 'the plums',
 'that were in',
 'the icebox',
 '',
 'and which',
 'you were probably',
 'saving',
 'for breakfast',
 '',
 'Forgive me',
 'they were delicious',
 'so sweet',
 'and so cold']

In [46]:
type(f)

_io.TextIOWrapper

In [47]:
# If it's a very large file, we can process it line by line inside the with block
with open('This is just to say.txt') as f:
    for each_line in f:
        print(each_line)

I have eaten

the plums

that were in

the icebox



and which

you were probably

saving

for breakfast



Forgive me

they were delicious

so sweet

and so cold


In [48]:
# Fixing the double spacing: each line already ends with '\n' and print() adds another
with open('This is just to say.txt') as f:
    for each_line in f:
        print(each_line.strip())

I have eaten
the plums
that were in
the icebox

and which
you were probably
saving
for breakfast

Forgive me
they were delicious
so sweet
and so cold


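The `strip()` method removes the specified characters (whitespace, including `\n`, by default) from the beginning and end of a string and returns a new string. Here it removes the newline at the end of each line, so `print()` no longer produces a blank line after it.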
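
A small illustration of the difference between `strip()` and the more targeted `rstrip('\n')`, which removes only trailing newlines and leaves other whitespace alone. The sample string is made up for this sketch.

```python
line = '  so sweet \n'   # made-up line with extra spaces and a newline

stripped = line.strip()          # removes whitespace on both sides
no_newline = line.rstrip('\n')   # removes only the trailing newline

print(repr(stripped))     # 'so sweet'
print(repr(no_newline))   # '  so sweet '
print(repr(line))         # '  so sweet \n' (the original string is unchanged)
```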

#### EXERCISE:
Write a program that asks the user for a target word (using `input()`), then reads the file and counts how many times that word appears.
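
One possible approach, as a sketch rather than the official solution: the helper `count_word` is a hypothetical name, and matching whole words case-insensitively is an assumption about what "appears" should mean. The text is given inline here; in the notebook it would come from `f.read()` and the target from `input()`.

```python
import string

def count_word(text, target):
    """Count whole-word, case-insensitive occurrences of target in text."""
    target = target.lower()
    count = 0
    for word in text.lower().split():
        # Drop punctuation attached to the word, e.g. "plums," -> "plums"
        if word.strip(string.punctuation) == target:
            count += 1
    return count

# First stanza of the poem, copied from the output above
sample = 'I have eaten\nthe plums\nthat were in\nthe icebox'

print(count_word(sample, 'the'))    # 2
print(count_word(sample, 'The'))    # 2 (case is ignored)
print(count_word(sample, 'plum'))   # 0 (only whole words are counted)
```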

In [49]:
each_line

'and so cold'

In [50]:
f_str

'I have eaten\nthe plums\nthat were in\nthe icebox\n\nand which\nyou were probably\nsaving\nfor breakfast\n\nForgive me\nthey were delicious\nso sweet\nand so cold'

## Reading tabular data

### `pathlib` package
`pathlib` provides classes and methods for working with file system paths (files and directories) in Python.

In [51]:
import pathlib

In [52]:
home = pathlib.Path.home()
home

WindowsPath('C:/Users/techl')

In [53]:
cwd = pathlib.Path.cwd() # Current working directory
cwd

WindowsPath('C:/Users/techl/OneDrive - University of Haifa/Professional/Teaching/Python Course/COGNITION/Cognition Sem B 2024/Lessons/Lesson 8 - Reading files')

In [54]:
my_file_path = cwd / 'data/Flanker/Study1_P1Flanker1.csv'
my_file_path

WindowsPath('C:/Users/techl/OneDrive - University of Haifa/Professional/Teaching/Python Course/COGNITION/Cognition Sem B 2024/Lessons/Lesson 8 - Reading files/data/Flanker/Study1_P1Flanker1.csv')

In [55]:
my_file_path = cwd / 'data' / 'Flanker' / 'Study1_P1Flanker1.csv'
my_file_path

WindowsPath('C:/Users/techl/OneDrive - University of Haifa/Professional/Teaching/Python Course/COGNITION/Cognition Sem B 2024/Lessons/Lesson 8 - Reading files/data/Flanker/Study1_P1Flanker1.csv')
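
Besides joining path parts with `/`, a `Path` object exposes the pieces of a path as attributes. A short sketch, using a relative path so that it does not depend on the machine it runs on:

```python
import pathlib

# Relative path built from the same parts as above
p = pathlib.Path('data') / 'Flanker' / 'Study1_P1Flanker1.csv'

print(p.name)          # Study1_P1Flanker1.csv
print(p.stem)          # Study1_P1Flanker1
print(p.suffix)        # .csv
print(p.parent.name)   # Flanker

# exists() checks the path against the current working directory
print(p.exists())
```

Checking `exists()` before opening a file can make a "file not found" problem easier to diagnose.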

### Reading tabular data file
`pathlib` makes it easy to write short readable code for reading files.

In [56]:
with open(my_file_path) as f:
    data = f.read()
data[:500]

'1,1,2,1,1,1.7112\n1,2,2,2,1,0.55779\n1,3,1,1,1,0.40306\n1,4,1,2,0,abcd\n1,5,2,2,1,0.49219\n1,6,1,2,1,0.5769\n1,7,1,1,1,0.44243\n1,8,2,0,1,0.36883\n1,9,2,1,1,0.36999\n1,10,2,1,1,0.42133\n1,11,2,2,1,0.45723\n1,12,2,0,1,0.36671\n1,13,2,0,1,0.42714\n1,14,1,2,0,0.36884\n1,15,1,0,1,0.42787\n1,16,2,2,1,0.704\n1,17,2,0,1,0.37218\n1,18,1,2,1,0.48593\n1,19,1,1,1,0.33243\n1,20,1,0,1,0.294\n1,21,1,1,1,0.28793\n1,22,2,1,0,0.37779\n1,23,1,2,1,0.57301\n1,24,2,1,1,0.47621\n1,25,2,2,1,0.4498\n1,26,2,1,1,0.37619\n1,27,2,0,1,0.33856\n1,28,1'

In [57]:
with open('C:/Users/techl/OneDrive - University of Haifa/Professional/Teaching/Python Course/COGNITION/Cognition Sem B 2024/Lessons/Lesson 8 - Reading files/data/Flanker/Study1_P1Flanker1.csv') as f:
    data = f.read()
data[:500]

'1,1,2,1,1,1.7112\n1,2,2,2,1,0.55779\n1,3,1,1,1,0.40306\n1,4,1,2,0,abcd\n1,5,2,2,1,0.49219\n1,6,1,2,1,0.5769\n1,7,1,1,1,0.44243\n1,8,2,0,1,0.36883\n1,9,2,1,1,0.36999\n1,10,2,1,1,0.42133\n1,11,2,2,1,0.45723\n1,12,2,0,1,0.36671\n1,13,2,0,1,0.42714\n1,14,1,2,0,0.36884\n1,15,1,0,1,0.42787\n1,16,2,2,1,0.704\n1,17,2,0,1,0.37218\n1,18,1,2,1,0.48593\n1,19,1,1,1,0.33243\n1,20,1,0,1,0.294\n1,21,1,1,1,0.28793\n1,22,2,1,0,0.37779\n1,23,1,2,1,0.57301\n1,24,2,1,1,0.47621\n1,25,2,2,1,0.4498\n1,26,2,1,1,0.37619\n1,27,2,0,1,0.33856\n1,28,1'

In [58]:
type(data)

str

In [59]:
print(f)

<_io.TextIOWrapper name='C:/Users/techl/OneDrive - University of Haifa/Professional/Teaching/Python Course/COGNITION/Cognition Sem B 2024/Lessons/Lesson 8 - Reading files/data/Flanker/Study1_P1Flanker1.csv' mode='r' encoding='cp1252'>


In [60]:
type(f)

_io.TextIOWrapper

In [61]:
len(data)

13760

In [62]:
# Option B - go line by line
with open(my_file_path) as f:
    for each_line in f:
        print(each_line)

1,1,2,1,1,1.7112

1,2,2,2,1,0.55779

1,3,1,1,1,0.40306

1,4,1,2,0,abcd

1,5,2,2,1,0.49219

1,6,1,2,1,0.5769

1,7,1,1,1,0.44243

1,8,2,0,1,0.36883

1,9,2,1,1,0.36999

1,10,2,1,1,0.42133

1,11,2,2,1,0.45723

1,12,2,0,1,0.36671

1,13,2,0,1,0.42714

1,14,1,2,0,0.36884

1,15,1,0,1,0.42787

1,16,2,2,1,0.704

1,17,2,0,1,0.37218

1,18,1,2,1,0.48593

1,19,1,1,1,0.33243

1,20,1,0,1,0.294

1,21,1,1,1,0.28793

1,22,2,1,0,0.37779

1,23,1,2,1,0.57301

1,24,2,1,1,0.47621

1,25,2,2,1,0.4498

1,26,2,1,1,0.37619

1,27,2,0,1,0.33856

1,28,1,2,1,0.38075

1,29,1,1,1,0.36834

1,30,1,0,1,0.38119

1,31,2,2,1,0.55301

1,32,1,1,1,0.44781

1,33,2,1,1,0.42136

1,34,2,0,1,0.46837

1,35,2,0,1,0.34416

1,36,2,2,0,0.44142

1,37,2,1,1,0.7273

1,38,1,0,1,0.46812

1,39,1,2,1,0.61684

1,40,2,2,0,0.35365

1,41,2,0,1,0.46057

1,42,1,0,1,0.45027

1,43,1,1,1,0.44663

1,44,1,1,1,0.35685

1,45,1,2,1,0.52075

1,46,2,1,1,0.48839

1,47,2,2,1,0.472

1,48,1,0,0,0.30759

1,49,2,2,1,0.37964

1,50,1,0,1,0.63937

1,51,1,1,1,0.41052

1,

In [63]:
type(each_line)

str

In [37]:
data[:100]

['1,1,2,1,1,1.7112',
 '1,2,2,2,1,0.55779',
 '1,3,1,1,1,0.40306',
 '1,4,1,2,0,abcd',
 '1,5,2,2,1,0.49219',
 '1,6,1,2,1,0.5769',
 '1,7,1,1,1,0.44243',
 '1,8,2,0,1,0.36883',
 '1,9,2,1,1,0.36999',
 '1,10,2,1,1,0.42133',
 '1,11,2,2,1,0.45723',
 '1,12,2,0,1,0.36671',
 '1,13,2,0,1,0.42714',
 '1,14,1,2,0,0.36884',
 '1,15,1,0,1,0.42787',
 '1,16,2,2,1,0.704',
 '1,17,2,0,1,0.37218',
 '1,18,1,2,1,0.48593',
 '1,19,1,1,1,0.33243',
 '1,20,1,0,1,0.294',
 '1,21,1,1,1,0.28793',
 '1,22,2,1,0,0.37779',
 '1,23,1,2,1,0.57301',
 '1,24,2,1,1,0.47621',
 '1,25,2,2,1,0.4498',
 '1,26,2,1,1,0.37619',
 '1,27,2,0,1,0.33856',
 '1,28,1,2,1,0.38075',
 '1,29,1,1,1,0.36834',
 '1,30,1,0,1,0.38119',
 '1,31,2,2,1,0.55301',
 '1,32,1,1,1,0.44781',
 '1,33,2,1,1,0.42136',
 '1,34,2,0,1,0.46837',
 '1,35,2,0,1,0.34416',
 '1,36,2,2,0,0.44142',
 '1,37,2,1,1,0.7273',
 '1,38,1,0,1,0.46812',
 '1,39,1,2,1,0.61684',
 '1,40,2,2,0,0.35365',
 '1,41,2,0,1,0.46057',
 '1,42,1,0,1,0.45027',
 '1,43,1,1,1,0.44663',
 '1,44,1,1,1,0.35685',
 '1,45,1

In [64]:
# Option C - if the file is not too long
with open(my_file_path) as f:
    data = f.read().split('\n')

data[:10] # Print first 10 rows of data

['1,1,2,1,1,1.7112',
 '1,2,2,2,1,0.55779',
 '1,3,1,1,1,0.40306',
 '1,4,1,2,0,abcd',
 '1,5,2,2,1,0.49219',
 '1,6,1,2,1,0.5769',
 '1,7,1,1,1,0.44243',
 '1,8,2,0,1,0.36883',
 '1,9,2,1,1,0.36999',
 '1,10,2,1,1,0.42133']
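
Two details are worth watching when splitting a file this way. `split('\n')` leaves an empty string at the end when the text ends with a newline, and the fourth row of the sample above has a non-numeric value (`abcd`) in its last column. The sketch below handles both, using a short inline sample in place of the file; the variable names are made up for this illustration.

```python
# Inline sample copied from the first rows shown above
raw = '1,1,2,1,1,1.7112\n1,2,2,2,1,0.55779\n1,3,1,1,1,0.40306\n1,4,1,2,0,abcd\n'

print(raw.split('\n')[-1] == '')   # True: split leaves a trailing empty string
rows = raw.splitlines()            # splitlines does not

rts = []
skipped = []
for row in rows:
    fields = row.split(',')
    try:
        rts.append(float(fields[5]))
    except ValueError:
        # Not a number (e.g. 'abcd'): remember the row and move on
        skipped.append(row)

print(rts)       # [1.7112, 0.55779, 0.40306]
print(skipped)   # ['1,4,1,2,0,abcd']
```

Whether to skip such rows or stop with an error depends on the analysis; skipping is only one option.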

#### Eriksen's Flanker task
The flanker task:

    "In the Flanker task, arrows point either to the left or the right, and the subject is instructed to press one of two buttons indicating the direction of the arrow in the middle. If it’s pointing to the left, the subject presses the “left” button; if it’s pointing to the right, the subject presses the “right” button. The middle arrow is flanked by other arrows which either point in the same direction as the middle arrow, or point in the opposite direction from the middle arrow."

![](https://andysbrainbook.readthedocs.io/en/stable/_images/Flanker_Example.png)

    "An example of the two conditions of the Flanker task. In the Incongruent condition, the central arrow (which the subject is focusing on) points in the opposite direction as the flanking arrows; in the Congruent condition, the central arrow points in the same direction as the flanking arrows. In this example the correct response in the Incongruent condition would be to push the “left” button, and the correct response in the Congruent condition would be to push the “right” button."

## Exercise
- Read the file `Study1_P1Flanker2.csv` (note: this is file number __2__, not the file used above).
- Calculate the mean (i.e., average) reaction time (RT) across the task.

*Additional*:
- Can you calculate the mean accuracy per condition (Congruent, Incongruent, Neutral)?

Use the code examples above and the `pathlib` package. Check README.txt for the meaning of each column.

In [81]:
my_file_path

WindowsPath('C:/Users/techl/OneDrive - University of Haifa/Professional/Teaching/Python Course/COGNITION/Cognition Sem B 2024/Lessons/Lesson 8 - Reading files/data/Flanker/Study1_P1Flanker2.csv')

In [65]:
# Load data as list of lines
cwd = pathlib.Path.cwd()
my_file_path = cwd / 'data' / 'Flanker' / 'Study1_P1Flanker2.csv'

with open(my_file_path) as f:
    data = f.read().split('\n')

In [86]:
line_as_list

['5', '144', '1', '2', '1', '0.80004']

In [88]:
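# Collect the RTs (as floats) in a separate list per condition
cong_RTs = []
neutral_RTs = []
incong_RTs = []

# data[:-1] skips the last element (an empty string when the file ends with a newline)
for each_line in data[:-1]:
    line_as_list = each_line.split(',')
    condition = line_as_list[3]      # Condition code, kept as a string
    rt = float(line_as_list[5])      # Reaction time

    if condition == '0':
        cong_RTs.append(rt)
    elif condition == '1':
        neutral_RTs.append(rt)
    else:
        incong_RTs.append(rt)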

In [91]:
# Calculate mean RT
print(0, sum(cong_RTs) / len(cong_RTs))
print(1, sum(neutral_RTs) / len(neutral_RTs))
print(2, sum(incong_RTs) / len(incong_RTs))

0 0.25848716250000003
1 0.2458448395833332
2 0.2702616312500001
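
The same grouping idea can be used for the mean accuracy per condition. This is only a sketch: it assumes that the fifth column (index 4) holds accuracy coded as 1 for correct and 0 for error, which should be checked against README.txt, and it uses a few inline rows in place of the file. The names `accuracy_by_condition` and `mean_accuracy` are made up for this example.

```python
# A few rows copied from the Flanker1 output shown earlier
rows = [
    '1,1,2,1,1,1.7112',
    '1,2,2,2,1,0.55779',
    '1,5,2,2,1,0.49219',
    '1,8,2,0,1,0.36883',
    '1,14,1,2,0,0.36884',
    '1,16,2,2,1,0.704',
    '1,22,2,1,0,0.37779',
    '1,48,1,0,0,0.30759',
]

# Group the accuracy values by condition code
accuracy_by_condition = {}
for row in rows:
    fields = row.split(',')
    condition = fields[3]      # condition code, as in the loop above
    correct = int(fields[4])   # ASSUMPTION: 1 = correct, 0 = error
    accuracy_by_condition.setdefault(condition, []).append(correct)

# Mean accuracy is the proportion of correct trials in each group
mean_accuracy = {}
for condition, values in sorted(accuracy_by_condition.items()):
    mean_accuracy[condition] = sum(values) / len(values)
    print(condition, mean_accuracy[condition])
```

Using a dictionary keyed by the condition code avoids writing one list and one `if` branch per condition.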


## Saving result to a file

In [None]:
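save_path = cwd / 'output' / 'meanRT.txt'
save_path.parent.mkdir(exist_ok=True)   # Create the 'output' folder if it is missing

# Mean RT for the neutral condition (list built in the loop above)
ntr_mean = sum(neutral_RTs) / len(neutral_RTs)

# mode="x" creates a new file and raises FileExistsError if it already exists
with open(save_path, mode="x") as f:
    out_txt_ntr = 'The mean neutral is ' + str(ntr_mean)
    f.write(out_txt_ntr)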
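
Because mode `"x"` refuses to touch an existing file, running the cell a second time raises `FileExistsError`. A sketch of one way to deal with that, falling back to mode `"w"` (which overwrites); it uses a temporary folder and made-up names (`demo_path`, `already_existed`) so that it can be run safely:

```python
import pathlib
import tempfile

out_dir = pathlib.Path(tempfile.mkdtemp())   # throwaway folder for the demo
demo_path = out_dir / 'meanRT.txt'
already_existed = False

# First write: the file does not exist yet, so "x" succeeds
with open(demo_path, mode='x') as f:
    f.write('first version')

# Second write: "x" now fails, so fall back to overwriting with "w"
try:
    with open(demo_path, mode='x') as f:
        f.write('second version')
except FileExistsError:
    already_existed = True
    with open(demo_path, mode='w') as f:
        f.write('second version')

print(already_existed)         # True
print(demo_path.read_text())   # second version
```

Whether silently overwriting is acceptable depends on the situation; keeping `"x"` is the safer choice when results must not be lost.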