# getting started working with csv data in Python, 001
...continued, after some time
## resources
- [csv - CSV File Reading and Writing](https://docs.python.org/3.8/library/csv.html?highlight=csv#module-csv)

In [1]:
import csv

In [None]:
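# just basic csv.reader stuff
# some of these args could be dispensed with? newline = '' is what the csv docs recommend,
# but errors = 'ignore' silently drops any bytes that fail to decode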
with open('testdata/csv_test_001.csv', encoding = 'utf8', errors = 'ignore', newline = '') as file_in:
    reader = csv.reader(file_in)
    for row in reader:
        print(row)

## note
- `\ufeff` at the start of the first cell is a byte order mark (BOM): see [u'\ufeff' in Python string](https://stackoverflow.com/questions/17912307/u-ufeff-in-python-string)

In [None]:
# I shall try it a different way
with open('testdata/csv_test_001.csv', encoding = 'utf8', errors = 'ignore', newline = '') as file_in:
    reader = csv.reader(file_in)
    for row in reader:
        for value in row:
            # skip empty cells, print everything else
            if value != '':
                print(value)
# I *can* access a value within a row!

In [10]:
with open('testdata/csv_test_001.csv', encoding = 'utf8', errors = 'ignore', newline = '') as file_in:
    reader = csv.reader(file_in)
    rowcount = 0
    for row in reader:
        if rowcount == 0:
            print(f"first row / HEADERS: {' / '.join(row)}")
            rowcount += 1
        elif rowcount > 0:
            # (see aside [1])
            print(f"row number {rowcount + 1}")
            for value in row:
                # note: row.index(value) finds the first match, so repeated values report the same column
                print(f"column {row.index(value) + 1} / value = {value}")
            rowcount += 1
        else:
            print("ERROR ERROR ERROR...")
# aside [1]:
    # the reader object is iterable, so it seems I should be able to get a row number from
    # an index or something other than my own rowcount

first row / HEADERS: ﻿id / string / attribute / value / comment
row number 2
column 1 / value = 1
column 2 / value = יל בוקייטו די רומאנסאס : קאנטיקאס ריקוז''יד'אס, פארה נוב'יאס אי פאריד
column 3 / value = color
column 4 / value = red
column 5 / value = full row
row number 3
column 1 / value = 2
column 2 / value = ピザ
column 3 / value = cost
column 4 / value = cheap
column 5 / value = full row
row number 4
column 1 / value = 
column 1 / value = 
column 1 / value = 
column 1 / value = 
column 1 / value = 
row number 5
column 1 / value = 3
column 2 / value = hello
column 3 / value = 
column 3 / value = 
column 5 / value = blank cells
row number 6
column 1 / value = 4
column 2 / value = bateau à voile
column 3 / value = English
column 4 / value = sailboat
column 5 / value = full row
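
On aside [1]: the `csv.reader` object does expose a `line_num` attribute, and `enumerate()` gives positions without a hand-rolled counter. It also sidesteps something visible in the output above: `row.index(value)` returns the position of the first matching value, so the repeated empty cells in rows 4 and 5 are all reported under the same column number. A minimal sketch, using an in-memory sample rather than `testdata/csv_test_001.csv` (the sample data and the `describe_rows` name are made up for illustration):

```python
import csv
import io

# Hypothetical sample standing in for the test file; the data row has repeated empty cells.
SAMPLE = "id,string,attribute,value,comment\n3,hello,,,blank cells\n"

def describe_rows(file_in):
    """Return the printed lines: one for the header row, then one per row and per cell."""
    lines = []
    reader = csv.reader(file_in)
    for row in reader:
        # reader.line_num is the number of source lines read so far (1 after the first row)
        if reader.line_num == 1:
            lines.append(f"first row / HEADERS: {' / '.join(row)}")
            continue
        lines.append(f"row number {reader.line_num}")
        # enumerate() yields the real position of each cell, even when values repeat
        for col, value in enumerate(row, start=1):
            lines.append(f"column {col} / value = {value}")
    return lines

for line in describe_rows(io.StringIO(SAMPLE, newline="")):
    print(line)
```

One caveat: `line_num` counts lines, not records, so the two drift apart if a quoted cell contains a line break. In that case `enumerate(reader, start=1)` is probably the safer row counter.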


## notes
- Accessing the index of each cell in a row is important: I believe I'll need it to retrieve the matching header for each cell value
- Next I believe that I'll create a dictionary containing the headers and match cell values to headers
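
The header-to-value matching described in these notes is also what `csv.DictReader` in the standard library does: it takes the first row as field names and yields each later row as a dict keyed by header. A small sketch, assuming an in-memory sample with the same headers as the test file (the sample rows are illustrative, not the real data):

```python
import csv
import io

# Hypothetical sample using the same header names as the test file.
SAMPLE = (
    "id,string,attribute,value,comment\n"
    "1,pizza,cost,cheap,full row\n"
    "3,hello,,,blank cells\n"
)

reader = csv.DictReader(io.StringIO(SAMPLE, newline=""))
rows = list(reader)  # each row is a dict: header -> cell value

print(reader.fieldnames)
for row in rows:
    # empty cells come back as empty strings, not None
    print(row["id"], repr(row["attribute"]), row["comment"])
```

Building the dictionary by hand is still a useful exercise; `DictReader` just removes the index bookkeeping once the idea is clear.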

In [3]:
with open('testdata/csv_test_001.csv', encoding = 'utf-8-sig', errors = 'ignore', newline = '') as file_in:
    reader = csv.reader(file_in)
    rowcount = 0
    headers = {}
    for row in reader:
        if rowcount == 0:
            for cell in row:
                header = {row.index(cell): cell}
                headers.update(header)
            print("updated headers dict")
            rowcount += 1
        else:
            pass
    print(headers)
    

updated headers dict
{0: 'id', 1: 'string', 2: 'attribute', 3: 'value', 4: 'comment'}


## notes
- Okay, the `\ufeff` is causing problems here; I will have to deal with it
- 🎉 super-very-easy solution > use arg `encoding = 'utf-8-sig'` in the `open()` function (see [open() (for Python 3.8)](https://docs.python.org/3.8/library/functions.html#open))
- Note that [the StackOverflow post](https://stackoverflow.com/questions/17912307/u-ufeff-in-python-string) also includes some more in-depth information beyond the "easy solution"
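
To see the difference between the two encodings side by side, here is a small sketch that writes a throwaway CSV starting with a UTF-8 byte order mark and reads the first header cell both ways. The temporary file and the `first_header` helper are made up for the demonstration; only `open()`, `csv.reader`, and `codecs.BOM_UTF8` come from the standard library.

```python
import codecs
import csv
import os
import tempfile

# Write a tiny CSV that begins with a UTF-8 BOM, as some spreadsheet exports do.
with tempfile.NamedTemporaryFile(mode="wb", suffix=".csv", delete=False) as tmp:
    tmp.write(codecs.BOM_UTF8 + "id,string\n1,hello\n".encode("utf-8"))
    path = tmp.name

def first_header(path, encoding):
    """Return the first cell of the first row, decoded with the given encoding."""
    with open(path, encoding=encoding, newline="") as file_in:
        return next(csv.reader(file_in))[0]

plain = first_header(path, "utf-8")         # BOM stays glued to the first cell
stripped = first_header(path, "utf-8-sig")  # BOM is skipped on read
os.remove(path)

print(repr(plain))
print(repr(stripped))
```

With `utf-8` the first header is `'\ufeffid'`, which is why it would not match a plain `'id'` key; with `utf-8-sig` it is `'id'`.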

## 💥here's where it breaks down💥

In [18]:
with open('testdata/csv_test_001.csv', encoding = 'utf-8-sig', errors = 'ignore', newline = '') as file_in:
    reader = csv.reader(file_in)
    rowcount = 0
    headers = {}
    for row in reader:
        with open('testdata/csv_test_001_out.csv', mode = 'w') as file_out:
            writer = csv.writer(file_out, delimiter = ',', quotechar = '"', quoting = csv.QUOTE_MINIMAL)
            if rowcount == 0:
                for cell in row:
                    header = {row.index(cell): cell}
                    headers.update(header)
            rowcount += 1
            if rowcount > 0:
                rowvalues = []
                for cell in row:
                    if row.index(cell) == 0:
                        pass
                    else:
                        rowvalues.append(f"{row.index(0)}")
                        rowvalues.append(cell)
                        rowvalues.append(f"{headers[cell]}")
                        writer.writerow(rowvalues)
print(headers)            

ValueError: 0 is not in list
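
A few things seem to be going wrong in this cell. `row.index(0)` searches the row for the integer `0`, but every cell from `csv.reader` is a string, hence `ValueError: 0 is not in list`; `row[0]` is probably what was meant. `headers` is keyed by column index, so `headers[cell]` would fail next with a `KeyError`. Opening the output file with `mode = 'w'` inside the loop truncates it on every row. And `rowcount += 1` runs before `if rowcount > 0`, so the header row is processed as data too. Below is a hedged sketch of what the cell appears to be aiming for: one output row per non-id cell, written as id / value / header. That reading of the intent is an assumption, and the `melt_rows` name and sample data are made up.

```python
import csv
import io

# Hypothetical sample; the real notebook reads testdata/csv_test_001.csv instead.
SAMPLE = "id,string,attribute\n1,hello,color\n2,pizza,cost\n"

def melt_rows(file_in, file_out):
    """Write one (id, cell value, header) row for every non-id cell."""
    reader = csv.reader(file_in)
    writer = csv.writer(file_out, delimiter=",", quotechar='"', quoting=csv.QUOTE_MINIMAL)
    headers = next(reader, None)  # first row holds the headers
    if headers is None:           # empty input: nothing to write
        return
    for row in reader:
        row_id = row[0]           # the value in the first column, not row.index(0)
        for col, cell in enumerate(row):
            if col == 0:
                continue          # skip the id column itself
            # look the header up by position, since that is how it is stored
            writer.writerow([row_id, cell, headers[col]])

out = io.StringIO(newline="")
melt_rows(io.StringIO(SAMPLE, newline=""), out)
print(out.getvalue())
```

With real files, both would be opened once, outside the loop, for example `open('testdata/csv_test_001_out.csv', mode = 'w', encoding = 'utf-8', newline = '')` for the output, so that rows accumulate instead of overwriting each other.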