## Reading and Writing Files

When we use the `open()` function in Python, we pass two arguments: __1)__ the file and __2)__ the mode, e.g., `open("mydiary.text", "r")`. Sometimes, you may also need to specify __3)__ the encoding.

__1.__ The file name should always include the file extension (e.g., ".txt" or ".html"). If the file we want to open is not in the same directory as the Jupyter Notebook, we must write the full path to the file. E.g.:

For Mac users: 

- `open("/Users/Georgios/Desktop/Downloads/GitHub/4/mydiary.text", "r")`

For Windows users: 

- `open("C:/Users/Georgios/Downloads/GitHub/4/mydiary.text", "r")`
    
    or:
    
- `open(r"C:\Users\Georgios\Downloads\GitHub\4\mydiary.text", "r")`
      
__2.__ There are several modes. The most useful:
- `"r"`: This is the default mode. It opens a file only for reading.
- `"a"`: It opens a file for appending, i.e., new text is added to the end of the existing content.
- `"w"`: It opens a file for writing, replacing any existing content.

__3.__ To specify the encoding, you can pass the `encoding` keyword argument, for example:
- `encoding="utf-8"`
- `encoding="ascii"`
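To see the three arguments working together, here is a minimal sketch. The file name `mydiary.txt` and the temporary folder are assumptions made for illustration, so that the example does not touch any of your own files.

```python
import os
import tempfile

# Hypothetical file inside a throwaway temporary folder.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "mydiary.txt")

# "w": create the file (or empty an existing one) and write to it.
f = open(path, "w", encoding="utf-8")
f.write("Day 1\n")
f.close()

# "a": keep what is already there and add to the end.
f = open(path, "a", encoding="utf-8")
f.write("Day 2\n")
f.close()

# "r": read only. This is also the mode used when no mode is given.
f = open(path, "r", encoding="utf-8")
diary = f.read()
f.close()

print(diary)
```

Passing the same `encoding` when writing and when reading keeps the text consistent across operating systems.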

### Read

#### Reading a file (slow way)

In [1]:
f = open("myfile.txt","r")
text = f.readlines()
f.close()

In [21]:
print(text)

['Γεώργιος Τσολάκης']


#### Reading a file (fast way)

In [22]:
with open("myfile.txt", "r", encoding="utf-8") as f:
    text = f.readlines()

In [23]:
print(text)

['When Mr Bilbo Baggins of Bag End announced that he would shortly be celebrating his eleventyifirst birthday with a party of special magnificence, there was much talk and excitement in Hobbiton.\n']
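The difference between the two ways is less about speed and more about who closes the file. The sketch below, which uses a hypothetical file in a temporary folder, suggests that a `with` block behaves much like opening the file, reading it, and calling `close()` yourself, except that the closing happens automatically even if an error occurs inside the block.

```python
import os
import tempfile

# Hypothetical sample file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "myfile.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("first line\nsecond line\n")

# Manual version: we are responsible for closing the file.
f = open(path, "r", encoding="utf-8")
try:
    manual_lines = f.readlines()
finally:
    f.close()

# "with" version: the file is closed when the indented block ends.
with open(path, "r", encoding="utf-8") as f:
    with_lines = f.readlines()

print(with_lines)
print(f.closed)  # the file object still exists, but it is closed
```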


#### Reading a file with encoding

In [24]:
with open("myGreekfile.txt", "r") as f:
    text = f.readlines()

You have probably received an error message. Try opening the file with an explicit encoding.

In [25]:
with open("myGreekfile.txt", "r", encoding="utf-8") as f:
    text = f.readlines()

In [26]:
print(text)

['Γεώργιος Τσολάκης']
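When no encoding is given, Python falls back to a default that depends on your system, and that default may not match the encoding in which the file was saved. The following sketch reproduces the problem on purpose: it assumes a hypothetical UTF-8 file in a temporary folder and tries to read it as ASCII, which cannot represent Greek letters.

```python
import os
import tempfile

# Hypothetical Greek text file saved as UTF-8.
path = os.path.join(tempfile.mkdtemp(), "myGreekfile.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("Γεώργιος Τσολάκης")

# Reading with the wrong encoding raises a UnicodeDecodeError.
try:
    with open(path, "r", encoding="ascii") as f:
        f.read()
    error_name = None
except UnicodeDecodeError as err:
    error_name = type(err).__name__

# Reading with the encoding the file was saved in works.
with open(path, "r", encoding="utf-8") as f:
    text = f.readlines()

print(error_name)
print(text)
```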



### Write

#### Append
The following code either creates a new file and writes the text to it or, if the file already exists, _appends_ the text to it. Run the cell once and open the file on your computer (not with Python). Then close the file and run the cell three more times. Open the file again. What do you notice?

In [27]:
f = open("newfile.txt", "a")
f.write("I love Digital Humanities!")
f.close()

#### Writing a file
The following code either creates a new file and writes the text to it or _overwrites_ the contents of an existing file.

In [28]:
f = open("newfile.txt", "w")
f.write("I love Digital Humanities!")
f.close()

Regardless of how many times you run the code, the result will remain the same.
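The contrast between `"a"` and `"w"` can be checked side by side. This is a small sketch using two hypothetical files in a temporary folder: the same sentence is written three times in each mode, and the contents are compared afterwards.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
append_path = os.path.join(folder, "append.txt")  # hypothetical file names
write_path = os.path.join(folder, "write.txt")

sentence = "I love Digital Humanities!"

# Simulate running each cell three times.
for _ in range(3):
    with open(append_path, "a", encoding="utf-8") as f:
        f.write(sentence)
    with open(write_path, "w", encoding="utf-8") as f:
        f.write(sentence)

with open(append_path, "r", encoding="utf-8") as f:
    appended = f.read()
with open(write_path, "r", encoding="utf-8") as f:
    written = f.read()

print(appended)  # the sentence three times in a row, with nothing in between
print(written)   # the sentence once
```

Note that `write()` does not add a line break on its own; to put each sentence on its own line, end the string with `"\n"`.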

#### Writing a file (fast way)

In [29]:
with open("newfile.txt", "w") as f:
    f.write("I love Digital Humanities!")

Make the necessary changes to the following code so that it runs without errors.

In [30]:
with open("newfile.txt", "w") as f:
    f.write("I love Digital Humanities! Γεώργιος is teaching the course!")

## Pandas and CSV/EXCEL

For some types of files, we need specific libraries. Some libraries are included by default with the Python distribution, while others need to be installed manually.

For files in a tabular format, like .csv (comma-separated values) or .xlsx (Excel), pandas is a powerful tool. To install pandas:

For Mac users:
- Click the Launchpad icon in the Dock, type "Terminal" in the search field, then click Terminal. Alternatively, click the magnifying glass icon in the menu bar or press Command + Space, type "Terminal", and double-click the Terminal app.
- Type `pip3 install pandas` and press Return.

For Windows users:
- Press the Windows button and type "cmd". 
- Open the `Command Prompt`
- Type `pip install pandas` and press Enter.
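If you are not sure whether a library is already available, you can check from inside the notebook before importing it. The helper `is_installed` below is a hypothetical name introduced for this sketch; it relies on `importlib.util.find_spec` from the standard library.

```python
import importlib.util


def is_installed(name):
    """Return True if Python can find a top-level module with this name."""
    return importlib.util.find_spec(name) is not None


print(is_installed("json"))    # part of the standard library
print(is_installed("pandas"))  # depends on whether the installation succeeded
```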


#### Libraries - Imports

Installing a library is not enough. We must import it.

In [31]:
import pandas as pd # The alias pd is a convention in the Python community.

#### Read

Download the CSV file of [Roman Amphitheaters](https://github.com/roman-amphitheaters/roman-amphitheaters/blob/main/roman-amphitheaters.csv) and place it in the same directory as your notebook. Run the following cells.

In [35]:
df = pd.read_csv('roman-amphitheaters.csv') # load the CSV file into a dataframe

`df` stands for dataframe, i.e., rows and columns of data.
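To get a feeling for what `pd.read_csv` does without downloading anything, the sketch below reads CSV text from memory. The three rows and four columns are a small excerpt copied from the output shown in this section, and `io.StringIO` is used here only to make a string behave like a file.

```python
import io

import pandas as pd

# A tiny excerpt of the amphitheater data: first line is the header row.
csv_text = (
    "id,label,chronogroup,arenamajor\n"
    "duraEuroposAmphitheater,Dura Europos,severan,31.0\n"
    "arlesAmphitheater,Arles,flavian,47.0\n"
    "lyonAmphitheater,Lyon,second-century,67.6\n"
)

# read_csv accepts a file path or a file-like object.
sample = pd.read_csv(io.StringIO(csv_text))

print(sample)
print(list(sample.columns))
```

Each line of the CSV becomes a row, each comma-separated value lands in a column, and pandas adds a numeric index (0, 1, 2, ...) on the left.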

In [37]:
df

Unnamed: 0,id,title,label,latintoponym,pleiades,welchid,golvinid,buildingtype,chronogroup,secondcentury,...,amphitheatrum,dimensionsunknown,arenamajor,arenaminor,extmajor,extminor,exteriorheight,longitude,latitude,elevation
0,duraEuroposAmphitheater,Amphitheater at Dura Europos,Dura Europos,Dura Europus,https://pleiades.stoa.org/places/893989,,129.0,amphitheater,severan,False,...,https://amphi-theatrum.de/1449.html,False,31.0,25.0,50.0,44.0,,40.728926,34.749855,223
1,arlesAmphitheater,Amphitheater at Arles,Arles,Arelate,https://pleiades.stoa.org/places/148217,,154.0,amphitheater,flavian,True,...,https://amphi-theatrum.de/1371.html,False,47.0,32.0,136.0,107.0,,4.631111,43.677778,21
2,lyonAmphitheater,Amphitheater at Lyon,Lyon,Lugdunum,https://pleiades.stoa.org/places/167717,,171.0,amphitheater,second-century,True,...,https://amphi-theatrum.de/1416.html,False,67.6,42.0,143.0,117.0,,4.830556,45.770556,206
3,ludusMagnusArena,Ludus Magnus Arena,Ludus Magnus,Ludus Magnus,https://pleiades.stoa.org/places/423025,,,practice-arena,imperial,False,...,https://amphi-theatrum.de/1708.html,True,,,,,,12.494913,41.889950,22
4,romeFlavianAmphitheater,Flavian Amphitheater at Rome,Flavian Amphitheater,Roma,https://pleiades.stoa.org/places/423025,,152.0,amphitheater,flavian,True,...,https://amphi-theatrum.de/1373.html,False,83.0,48.0,189.0,156.0,52.0,12.492269,41.890169,22
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
270,saintGeorgesDuBoisAmphitheater,Amphitheater at Saint-Georges-du-Bois,Saint-Georges-du-Bois,,,,103.0,amphitheater,first-century,True,...,https://amphi-theatrum.de/1529.html,False,54.0,30.0,65.0,50.0,,-0.749919,46.142723,39
271,toledoAmphitheater,Amphitheater at Toledo,Toledo,Toletum,https://pleiades.stoa.org/places/266066,,,amphitheater,imperial,True,...,https://amphi-theatrum.de/3090.html,True,,,,,,-4.022888,39.865349,482
272,kaiseraugustAmphitheater,Amphitheater at Kaiseraugst,Kaiseraugst,Castrum Rauracense,https://pleiades.stoa.org/places/81716458,,,amphitheater,fourth-century,False,...,https://amphi-theatrum.de/3066.html,False,,,50.0,40.0,,7.721596,47.540822,482
273,ammaiaAmphitheater,Amphitheater at Ammaia,Ammaia,Ammaia,https://pleiades.stoa.org/places/255975,,,amphitheater,imperial,True,...,https://amphi-theatrum.de/3020.html,False,,,60.0,,,-7.391970,39.369905,566


In [38]:
df.head()

Unnamed: 0,id,title,label,latintoponym,pleiades,welchid,golvinid,buildingtype,chronogroup,secondcentury,...,amphitheatrum,dimensionsunknown,arenamajor,arenaminor,extmajor,extminor,exteriorheight,longitude,latitude,elevation
0,duraEuroposAmphitheater,Amphitheater at Dura Europos,Dura Europos,Dura Europus,https://pleiades.stoa.org/places/893989,,129.0,amphitheater,severan,False,...,https://amphi-theatrum.de/1449.html,False,31.0,25.0,50.0,44.0,,40.728926,34.749855,223
1,arlesAmphitheater,Amphitheater at Arles,Arles,Arelate,https://pleiades.stoa.org/places/148217,,154.0,amphitheater,flavian,True,...,https://amphi-theatrum.de/1371.html,False,47.0,32.0,136.0,107.0,,4.631111,43.677778,21
2,lyonAmphitheater,Amphitheater at Lyon,Lyon,Lugdunum,https://pleiades.stoa.org/places/167717,,171.0,amphitheater,second-century,True,...,https://amphi-theatrum.de/1416.html,False,67.6,42.0,143.0,117.0,,4.830556,45.770556,206
3,ludusMagnusArena,Ludus Magnus Arena,Ludus Magnus,Ludus Magnus,https://pleiades.stoa.org/places/423025,,,practice-arena,imperial,False,...,https://amphi-theatrum.de/1708.html,True,,,,,,12.494913,41.88995,22
4,romeFlavianAmphitheater,Flavian Amphitheater at Rome,Flavian Amphitheater,Roma,https://pleiades.stoa.org/places/423025,,152.0,amphitheater,flavian,True,...,https://amphi-theatrum.de/1373.html,False,83.0,48.0,189.0,156.0,52.0,12.492269,41.890169,22


In [39]:
df.tail()

Unnamed: 0,id,title,label,latintoponym,pleiades,welchid,golvinid,buildingtype,chronogroup,secondcentury,...,amphitheatrum,dimensionsunknown,arenamajor,arenaminor,extmajor,extminor,exteriorheight,longitude,latitude,elevation
270,saintGeorgesDuBoisAmphitheater,Amphitheater at Saint-Georges-du-Bois,Saint-Georges-du-Bois,,,,103.0,amphitheater,first-century,True,...,https://amphi-theatrum.de/1529.html,False,54.0,30.0,65.0,50.0,,-0.749919,46.142723,39
271,toledoAmphitheater,Amphitheater at Toledo,Toledo,Toletum,https://pleiades.stoa.org/places/266066,,,amphitheater,imperial,True,...,https://amphi-theatrum.de/3090.html,True,,,,,,-4.022888,39.865349,482
272,kaiseraugustAmphitheater,Amphitheater at Kaiseraugst,Kaiseraugst,Castrum Rauracense,https://pleiades.stoa.org/places/81716458,,,amphitheater,fourth-century,False,...,https://amphi-theatrum.de/3066.html,False,,,50.0,40.0,,7.721596,47.540822,482
273,ammaiaAmphitheater,Amphitheater at Ammaia,Ammaia,Ammaia,https://pleiades.stoa.org/places/255975,,,amphitheater,imperial,True,...,https://amphi-theatrum.de/3020.html,False,,,60.0,,,-7.39197,39.369905,566
274,contributaAmphitheater,Amphitheater at Contributa Iulia Ugultunia,Contributa,Contributa Iulia Ugultunia,https://pleiades.stoa.org/places/256126,,,amphitheater,imperial,True,...,https://amphi-theatrum.de/3093.html,False,,,72.0,,,-6.38932,38.347751,501
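By default, `head()` and `tail()` return five rows, as in the output above. Both also accept a number if you want more or fewer rows. A minimal sketch with a made-up ten-row dataframe:

```python
import pandas as pd

# Made-up dataframe with the numbers 0 to 9 in a single column.
numbers = pd.DataFrame({"n": range(10)})

first_two = numbers.head(2)   # the first two rows
last_three = numbers.tail(3)  # the last three rows

print(first_two)
print(last_three)
```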


#### Data types in our csv

In [40]:
df.shape # notice that this is an attribute, so no parentheses

(275, 24)
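The result of `shape` is a tuple of the form (rows, columns), so it can be unpacked into two variables. The sketch below uses a small made-up dataframe and also shows what happens if parentheses are added by mistake.

```python
import pandas as pd

# Made-up dataframe with three rows and two columns.
small = pd.DataFrame({"id": ["a", "b", "c"], "elevation": [223, 21, 206]})

rows, columns = small.shape
print(rows, columns)

# shape is an attribute, not a method: calling it raises a TypeError.
try:
    small.shape()
    message = ""
except TypeError as err:
    message = str(err)

print(message)
```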

In [41]:
df.info() # notice that this is a method, so we need parentheses

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 275 entries, 0 to 274
Data columns (total 24 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   id                 275 non-null    object 
 1   title              275 non-null    object 
 2   label              275 non-null    object 
 3   latintoponym       261 non-null    object 
 4   pleiades           274 non-null    object 
 5   welchid            19 non-null     float64
 6   golvinid           166 non-null    float64
 7   buildingtype       275 non-null    object 
 8   chronogroup        275 non-null    object 
 9   secondcentury      275 non-null    bool   
 10  capacity           153 non-null    float64
 11  modcountry         275 non-null    object 
 12  romanregion        274 non-null    object 
 13  zotero             56 non-null     object 
 14  amphitheatrum      239 non-null    object 
 15  dimensionsunknown  275 non-null    bool   
 16  arenamajor         184 non