# Getting started working with CSV data in Python

- exploring Python's `csv` module for the use case described in [tabular_data_and_python3_2021-12.md](https://gist.github.com/briesenberg07/1d3bdc9d079a8581768088dea1111f6b#using-python-3-to-transform-tabular-data-from-format-one-to-format-two)
- much of the testing here is adapted from the Real Python tutorial [Reading and Writing CSV Files in Python](https://realpython.com/python-csv/), section [Parsing CSV Files With Python’s Built-in CSV Library](https://realpython.com/python-csv/#parsing-csv-files-with-pythons-built-in-csv-library)

In [1]:
# do this before running any of the cells below!
import csv

In [None]:
# use the csv module's reader function
# adapted from the tutorial

with open('ssdc_test_text_001.csv') as file:
    reader = csv.reader(file, delimiter=",")
    line_count = 0
    for row in reader:
        if line_count == 0:
            # didn't understand {' / '.join(row)} at first
            print(f"Here are the headers, separated by slashes:\n{' / '.join(row)}")
            # ah, I see
                # per w3schools - string.join(iterable) [string will be used as a separator]
                # per docs.python.org/3.8 - str.join(iterable) - Return a string which is the
                # concatenation of the strings in iterable. A TypeError will be raised if there
                # are any non-string values in iterable, including bytes objects. The separator
                # between elements is the string providing this method.
            line_count += 1
        else:
            pass
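
To make the `str.join` behavior noted in the comments above concrete, here is a small sketch. The header names are made up for illustration (only 'Title' and 'cdmnumber' appear elsewhere in these notes), and no CSV file is needed to run it.

```python
# hypothetical header row, standing in for the first row of the CSV file
headers = ["Title", "Creator", "cdmnumber"]

# the string that .join() is called on is placed between the items
joined = " / ".join(headers)
print(joined)

# non-string items raise a TypeError, so convert them to str first
mixed = ["Title", 42]
try:
    " / ".join(mixed)
except TypeError as error:
    print(f"join failed: {error}")

converted = " / ".join(str(item) for item in mixed)
print(converted)
```

Values read by `csv.reader` are always strings, so the conversion step should only matter for lists built elsewhere.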

In [None]:
# use the csv module's DictReader class
# adapted from the tutorial

with open('ssdc_test_text_001.csv') as file:
    reader = csv.DictReader(file) # no delimiter argument needed: "," is the default
    line_count = 0
    for row in reader:
        if line_count == 0:
            print("Let's see whether I can get some headers by position...")
            print(f"the fifth column name is {row[4]}, and the tenth is {row[9]}.")
            print("Wonder if that worked...")
            line_count += 1
        else:
            pass                
    


In [None]:
# It *didn't* work: DictReader yields each row as a dict keyed by column header,
    # so row[4] raises a KeyError; I need to refer to columns by name and not index, so

with open('ssdc_test_text_001.csv') as file:
    reader = csv.DictReader(file) # no delimiter argument needed: "," is the default
    line_count = 0
    print("Let's get values using column headings instead?")
    for row in reader:
        # DictReader has already consumed the header row, so every row here is data
        if line_count < 4:
            print(f"This row's title is {row['Title']}.")
            line_count += 1
        else:
            break
    print("And so forth...")
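
For getting headers by position while still using `DictReader`, the reader's `fieldnames` attribute may be what's wanted. A minimal sketch, using an in-memory string in place of `ssdc_test_text_001.csv`; the sample columns and values are invented for illustration:

```python
import csv
import io

# hypothetical sample data standing in for ssdc_test_text_001.csv
sample = io.StringIO(
    "Title,Creator,cdmnumber\n"
    "First item,Someone,101\n"
    "Second item,,102\n"
)

reader = csv.DictReader(sample)

# DictReader reads the header row itself and exposes it as a list,
# so headers can be looked up by position here
fieldnames = reader.fieldnames
second_header = fieldnames[1]

# every row the loop yields is a data row, keyed by header name
titles = [row["Title"] for row in reader]

print(f"the second column name is {second_header}")
print(titles)
```

Because the header row is consumed before iteration starts, no `line_count == 0` check is needed to skip it.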


For [my use case](https://gist.github.com/briesenberg07/1d3bdc9d079a8581768088dea1111f6b#use-case), I need to access data by cell.  
Or, more accurately, I'll need to:

- access each cell value in a row, one at a time, skipping blank cells, and for each of these cell values:
    - fetch the column header for that value
    - fetch a specific value from the row (the value under the 'cdmnumber' header)

So, perhaps the `reader` function can do what I need?
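
One way the steps listed above might look with `csv.reader`, as a rough sketch rather than a finished solution. The sample data is invented, and treating a whitespace-only cell as blank is an assumption; only the 'cdmnumber' and 'Title' header names come from these notes.

```python
import csv
import io

# hypothetical sample data; the real file would be opened with open(...)
sample = io.StringIO(
    "cdmnumber,Title,Creator\n"
    "101,First item,Someone\n"
    "102,Second item,\n"
)

reader = csv.reader(sample)
headers = next(reader)                   # first row holds the column headers
cdm_index = headers.index("cdmnumber")   # position of the 'cdmnumber' column

found = []
for row in reader:
    # zip pairs each cell with the header for its column
    for header, cell in zip(headers, row):
        if not cell.strip():
            continue  # skip blank cells (assumption: whitespace-only counts as blank)
        found.append((row[cdm_index], header, cell))

for cdmnumber, header, cell in found:
    print(f"cdmnumber {cdmnumber}: {header} = {cell}")
```

Note that this sketch also reports the 'cdmnumber' cell itself; filtering it out would take one more condition.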


In [None]:
# starting here with the reading...I'll figure out the writing next...
    # accessing the headers

with open('testdata/ssdc_test_text_001.csv') as file:
    reader = csv.reader(file)
    row_number = 0
    for row in reader:
        if row_number == 0:
            # the first row holds the headers; each cell is already the header text
            for cell in row:
                print(f"This header is {cell}.")
            row_number += 1
        else:
            pass

In [None]:
# starting here with the reading...I'll figure out the writing next...
    # put headers in a list to access later using column_number as index

with open('testdata/ssdc_test_text_001.csv') as file:
    reader = csv.reader(file)
    row_number = 0
    headers = []
    for row in reader:
        if row_number == 0:
            headers = list(row)  # the first row is already a list of header strings
            row_number += 1
        else:
            pass


In [None]:
# lemme see if I did what I think I did...
print(headers)
print(row_number)

In [2]:
# can I write from what I'm reading??

def create_row():
    pass  # placeholder: not written yet


# open both files together: a reader can only be consumed once, and only while
# its file is open, so the reading and the writing happen in the same pass
with open('testdata/ssdc_test_text_001.csv', newline='') as file, \
        open('testdata/ssdc_for_upload_001.csv', mode='w', newline='') as file_out:  # don't forget write mode!
    reader = csv.reader(file)
    writer = csv.writer(file_out, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    headers = next(reader, [])  # first row holds the column headers
    column_number = 0
    for row_number, row in enumerate(reader, start=1):
        if row_number >= 5:
            break  # only the first four data rows for now
        # writerow() takes a sequence of values, not a dict
        writer.writerow([headers[column_number], row[column_number]])
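
Since the attempt above tried to hand `writerow` something dict-like, it may be worth noting that the `csv` module also has a `DictWriter` class that accepts dicts directly. A minimal sketch, pairing it with `DictReader` and using in-memory text in place of the real input and output files (the sample rows are invented):

```python
import csv
import io

# hypothetical sample input; stands in for testdata/ssdc_test_text_001.csv
source = io.StringIO("cdmnumber,Title\n101,First item\n102,Second item\n")
# stands in for an output file opened with mode='w', newline=''
target = io.StringIO()

reader = csv.DictReader(source)
# fieldnames sets which dict keys are written, and in what column order
writer = csv.DictWriter(target, fieldnames=reader.fieldnames)
writer.writeheader()
for row in reader:
    writer.writerow(row)  # each row is a dict keyed by header name

output = target.getvalue()
print(output)
```

With real files, the `csv` documentation recommends opening them with `newline=''` so the module can handle line endings itself.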
