# Quotes to table

I use this script to parse a text file of quotes into a table that I can copy and paste into a spreadsheet where I keep quotes from books.

The text patterns I use to delimit quotes and comments are as follows:
```
[Comments are in square brackets]
[Comments that end in a colon relate to the quote that follows:]
A quote starts with a new line and can have a page number at the end (optional). 123
Note that I keep quotes in one paragraph, so multiple paragraphs would be later parsed as multiple quotes. 123
```
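To make the pattern concrete, here is a minimal sketch of how a single paragraph could be classified. It is only an illustration, not the script itself: `parse_paragraph` and `PAGE_RE` are hypothetical names, and the sketch assumes a page number is any run of digits at the very end of a quote.

```python
import re

# Hypothetical helper for illustration only: classify one paragraph of the text file.
# Assumption: a page number is a run of digits at the very end of a quote.
PAGE_RE = re.compile(r"\s*(\d+)$")

def parse_paragraph(text):
    text = text.strip()
    if text.startswith("[") and text.endswith("]"):
        # A comment; one that ends in ":]" refers to the quote that follows.
        return {"kind": "comment", "text": text, "for_next_quote": text.endswith(":]")}
    match = PAGE_RE.search(text)
    page = match.group(1) if match else ""
    quote = PAGE_RE.sub("", text)
    return {"kind": "quote", "text": quote, "page": page}

print(parse_paragraph("[On labor and work:]"))
print(parse_paragraph("A quote with a page number at the end. 123"))
```

One consequence of this rule is that a quote without a page number that happens to end in digits (a year, for example) would have those digits read as the page.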

I capture the text with the camera app and then paste it into the Notes app. I found this to be the most consistent approach on an iPhone (as opposed to, e.g., using the OCR shortcut from inside an app).

I already have a spreadsheet, so copying to the clipboard is most useful for me. You can, of course, export the pandas DataFrame to a CSV or xls file instead if you'd like.
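If you prefer a file, a sketch along these lines should work. The DataFrame here is a made-up stand-in for the table the script produces, and the temporary folder and file name are placeholders.

```python
import os
import tempfile

import pandas as pd

# Stand-in for the table produced by the script; this row is made up for illustration.
df = pd.DataFrame(
    [{"book": "Example Book (Example Author)", "comment": "",
      "quote": "An example quote.", "page": "12"}]
)

# Written to a temporary folder here; swap in any path you like.
csv_path = os.path.join(tempfile.mkdtemp(), "quotes.csv")
df.to_csv(csv_path, index=False)

# DataFrame.to_excel is called the same way, but it needs an Excel engine
# such as openpyxl to be installed:
# df.to_excel("quotes.xlsx", index=False)
```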

My spreadsheet has columns for book, comment, quote, page, codes, and notes. I use the codes column to tag topics, so I can sort related quotes across books later.

In [106]:
import pandas as pd

def text2table(file, book_title, book_author):
    # read the file passed in and split it into paragraphs on blank lines
    with open(file, 'r', encoding='utf-8') as f:
        paragraphs = f.read().split('\n\n')
    df = pd.DataFrame({'quote': paragraphs})

    # clean up
    df['quote'] = df['quote'].str.strip()

    # parse comments: a paragraph ending in ":]" belongs to the quote that follows it
    df['comment'] = None
    for i in range(len(df) - 1):
        if df['quote'].iloc[i].endswith(":]"):
            df.at[i + 1, 'comment'] = df['quote'].iloc[i]
    # drop the comment rows now that they are attached to their quotes
    df = df[~(df['quote'].str.endswith(":]") & df['comment'].isna())].copy()

    # extract page numbers (trailing digits)
    df['page'] = df['quote'].str.extract(r'(\d+)$', expand=False)
    # remove page numbers, and the whitespace before them, from 'quote'
    df['quote'] = df['quote'].str.replace(r'\s*\d+$', '', regex=True)

    df['book'] = f'{book_title} ({book_author})'

    # pre export
    df = df.fillna('')
    df = df[['book', 'comment', 'quote', 'page']]

    return df

df = text2table(file="quotes.txt", book_title="The Human Condition", book_author="Hannah Arendt")
df.to_clipboard(header=False, index=False)
print("✅ Copied to clipboard")

✅ Copied to clipboard
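
The comment-parsing loop can also be written without an explicit loop. The following is a sketch of that alternative using `Series.shift`, not the version used above. The sample rows are made up, and unlike the loop it drops every paragraph ending in `:]`, including one that directly follows another comment.

```python
import pandas as pd

# Made-up sample rows standing in for the paragraphs read from the text file.
df = pd.DataFrame({"quote": ["[On action:]", "First quote. 12", "Second quote. 34"]})

is_comment = df["quote"].str.endswith(":]")
# A row that directly follows a comment takes that comment; all other rows get ''.
follows_comment = is_comment.shift(1, fill_value=False)
df["comment"] = df["quote"].shift(1).where(follows_comment, "")
# Drop the comment rows themselves.
df = df[~is_comment].reset_index(drop=True)
```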
