# PDFs and Spreadsheets
Excel files and Google Sheets spreadsheets can be exported to .csv files, but the export contains only the raw data: formulas, formatting, and images are not preserved.

Libraries to consider:
- Pandas: Data analysis library.
- Openpyxl: Designed to work specifically with Excel files.
- Google Sheets Python API: Direct Python interface for working with Google Spreadsheets.

In [26]:
import csv

### Open the file

In [27]:
data = open('Practice_Files/example.csv', encoding='utf-8')

In [28]:
csv_data = csv.reader(data)

### Convert it into a Python list of lists

In [29]:
data_lines = list(csv_data)
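
The cells above open the file and never close it. A minimal sketch of the same read using a `with` block, which closes the file automatically, is shown here. The sample rows and the temporary file path are assumptions that stand in for `Practice_Files/example.csv`, so the snippet runs on its own.

```python
import csv
import os
import tempfile

# Hypothetical sample data standing in for Practice_Files/example.csv
sample_rows = [
    ['id', 'first_name', 'last_name', 'email'],
    ['1', 'Joseph', 'Zaniolini', 'jzaniolini0@simplemachines.org'],
    ['2', 'Freida', 'Drillingcourt', 'fdrillingcourt1@umich.edu'],
]

# Write the sample to a temporary location so there is something to read
sample_path = os.path.join(tempfile.mkdtemp(), 'sample.csv')
with open(sample_path, mode='w', newline='', encoding='utf-8') as f:
    csv.writer(f).writerows(sample_rows)

# Read it back; the file is closed as soon as the with block ends
with open(sample_path, newline='', encoding='utf-8') as f:
    data_lines = list(csv.reader(f))

print(data_lines[0])    # header row
print(len(data_lines))  # header plus two data rows
```

Passing `newline=''` when opening a file for the `csv` module follows the recommendation in the Python documentation, so that newlines embedded in quoted fields are handled correctly.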

### Working with the data
#### Headers

In [30]:
data_lines[0]

['id', 'first_name', 'last_name', 'email', 'gender', 'ip_address', 'city']

In [31]:
len(data_lines)

1001

#### Lines

In [32]:
for line in data_lines[:5]:
    print(line)

['id', 'first_name', 'last_name', 'email', 'gender', 'ip_address', 'city']
['1', 'Joseph', 'Zaniolini', 'jzaniolini0@simplemachines.org', 'Male', '163.168.68.132', 'Pedro Leopoldo']
['2', 'Freida', 'Drillingcourt', 'fdrillingcourt1@umich.edu', 'Female', '97.212.102.79', 'Buri']
['3', 'Nanni', 'Herity', 'nherity2@statcounter.com', 'Female', '145.151.178.98', 'Claver']
['4', 'Orazio', 'Frayling', 'ofrayling3@economist.com', 'Male', '25.199.143.143', 'Kungur']


#### Extract one row

In [33]:
data_lines[10]

['10',
 'Hyatt',
 'Gasquoine',
 'hgasquoine9@google.ru',
 'Male',
 '221.155.106.39',
 'Złoty Stok']

#### Extract a field from a row

In [34]:
data_lines[10][3]

'hgasquoine9@google.ru'
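
Indexing by position (`[3]`) works, but it depends on remembering the column order. As an alternative sketch, `csv.DictReader` uses the header row as keys, so each field can be looked up by name. The in-memory CSV text below is an assumed sample, not the practice file.

```python
import csv
import io

# Assumed sample text with the same kind of header as the practice file
csv_text = (
    "id,first_name,last_name,email\n"
    "1,Joseph,Zaniolini,jzaniolini0@simplemachines.org\n"
    "2,Freida,Drillingcourt,fdrillingcourt1@umich.edu\n"
)

# DictReader consumes the first row as field names
reader = csv.DictReader(io.StringIO(csv_text))
records = list(reader)

# Each record maps column name -> value
print(records[1]['email'])
```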

#### Extract one column

In [38]:
all_emails = []

In [39]:
for line in data_lines[1:10]:
    all_emails.append(line[3])

In [40]:
all_emails

['jzaniolini0@simplemachines.org',
 'fdrillingcourt1@umich.edu',
 'nherity2@statcounter.com',
 'ofrayling3@economist.com',
 'jmurrison4@cbslocal.com',
 'lgamet5@list-manage.com',
 'dhowatt6@amazon.com',
 'kherion7@amazon.com',
 'chedworth8@china.com.cn']

##### With list comprehension

In [43]:
[ line[3] for line in data_lines[1:5] ]

['jzaniolini0@simplemachines.org',
 'fdrillingcourt1@umich.edu',
 'nherity2@statcounter.com',
 'ofrayling3@economist.com']
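
The column index can also be looked up from the header row instead of being hard-coded. The following is a small sketch with an assumed in-memory `data_lines` in the same list-of-lists shape as above.

```python
# Assumed sample in the same shape as data_lines above
data_lines = [
    ['id', 'first_name', 'last_name', 'email'],
    ['1', 'Joseph', 'Zaniolini', 'jzaniolini0@simplemachines.org'],
    ['2', 'Freida', 'Drillingcourt', 'fdrillingcourt1@umich.edu'],
    ['3', 'Nanni', 'Herity', 'nherity2@statcounter.com'],
]

# Split header from data rows, then find the column position by name
header, rows = data_lines[0], data_lines[1:]
email_idx = header.index('email')

all_emails = [row[email_idx] for row in rows]
print(all_emails)
```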

#### Putting columns together

In [44]:
full_names = []

In [45]:
for line in data_lines[1:6]:
    full_names.append(line[1]+" "+line[2])

In [46]:
full_names

['Joseph Zaniolini',
 'Freida Drillingcourt',
 'Nanni Herity',
 'Orazio Frayling',
 'Julianne Murrison']
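
The same concatenation can be written as a list comprehension with an f-string. This is a sketch on assumed sample rows; columns 1 and 2 hold the first and last name, as in the practice file.

```python
# Assumed sample rows (header already removed)
rows = [
    ['1', 'Joseph', 'Zaniolini'],
    ['2', 'Freida', 'Drillingcourt'],
]

# Build "first last" for every row in one expression
full_names = [f"{row[1]} {row[2]}" for row in rows]
print(full_names)
```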

### Writing a csv file

In [48]:
file_to_output = open('Practice_Files/to_save_file.csv', mode='w', newline='')

In [49]:
csv_writer = csv.writer(file_to_output, delimiter=',')

In [50]:
csv_writer.writerow(['a', 'b', 'c'])

7

In [51]:
csv_writer.writerows([['1', '2', '3'], ['4', '5', '6']])

In [52]:
file_to_output.close()
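
A `with` block removes the need to call `close()` by hand and guarantees the data is flushed to disk. The sketch below writes the same rows and reads them back to check the result. The temporary directory is an assumption used to keep the example self-contained.

```python
import csv
import os
import tempfile

# Hypothetical output location; the notebook uses Practice_Files/ instead
out_path = os.path.join(tempfile.mkdtemp(), 'to_save_file.csv')

with open(out_path, mode='w', newline='', encoding='utf-8') as f:
    csv_writer = csv.writer(f, delimiter=',')
    csv_writer.writerow(['a', 'b', 'c'])                         # one row
    csv_writer.writerows([['1', '2', '3'], ['4', '5', '6']])     # several rows

# Read the file back to confirm what was written
with open(out_path, newline='', encoding='utf-8') as f:
    saved_rows = list(csv.reader(f))

print(saved_rows)
```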

### Adding to a file

In [53]:
f = open('Practice_Files/to_save_file.csv', mode='a', newline='')

In [54]:
csv_writer = csv.writer(f)

In [56]:
csv_writer.writerow(['1', '2', '3']) # Returns the number of characters written

7

In [57]:
f.close()
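
To see where the `7` comes from, the sketch below appends a row to a small file and inspects the raw text. The file path is a temporary, assumed location. With the default dialect each row ends in `\r\n`, so `1,2,3` plus the line terminator is seven characters.

```python
import csv
import os
import tempfile

# Hypothetical file used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), 'append_demo.csv')

# Create the file with a header row
with open(path, mode='w', newline='') as f:
    csv.writer(f).writerow(['a', 'b', 'c'])

# mode='a' adds to the end instead of overwriting
with open(path, mode='a', newline='') as f:
    chars_written = csv.writer(f).writerow(['1', '2', '3'])

# newline='' on read keeps the \r\n terminators visible
with open(path, newline='') as f:
    raw_text = f.read()

print(chars_written)
print(repr(raw_text))
```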