<font color = blue>TITLE</font>:  Parse a Google Doc containing uplifting quotes into a Pandas DataFrame with columns "Quote" and "Author_Comment".

<font color = blue>DESCRIPTION</font>:  An associate asked for the quotes I have collected in tabular form so that they could be used in a web project.  The steps in this project were:

1. Download the quotes document from Google Drive to .docx format
2. Copy the document
*In this Jupyter Notebook*
3. Read the clipboard contents into a DataFrame, one quote per row
4. Split quote number from quote/author/comment using regex and .split
5. Split quote from author/comment using regex and .split
6. Extract author/comment from quote/author/comment using regex and .extract
7. Review the rows in the DataFrame and correct any errors found by editing the downloaded document from step 1
8. Repeat steps 1-7 until all errors are eliminated
9. Remove unnecessary columns, leaving "Quote" and "Author_Comment"
10. Write the DataFrame to a .csv file
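
The in-notebook steps 4, 5, 6 and 9 can be sketched as one small function. This is only an illustration on made-up sample lines; `parse_quotes` and its argument are hypothetical names, not part of the notebook, and the function assumes every line starts with a number followed by a period.

```python
import pandas as pd

def parse_quotes(lines):
    """Turn 'N. quote- author' strings into Quote / Author_Comment columns."""
    df = pd.DataFrame({"raw": lines})
    # Step 4: split off the quote number at the first period only.
    parts = df["raw"].str.split(".", n=1, expand=True)
    df["Quote_Author_Comment"] = parts[1].str.strip()
    # Step 5: keep the text before the first hyphen as the quote.
    quote_parts = df["Quote_Author_Comment"].str.split(r"-.+$", n=1, expand=True)
    df["Quote"] = quote_parts[0].str.strip()
    # Step 6: capture everything from the first hyphen onward (NaN if no hyphen).
    df["Author_Comment"] = df["Quote_Author_Comment"].str.extract(r"(-.+)", expand=False)
    # Step 9: keep only the two columns of interest.
    return df[["Quote", "Author_Comment"]]

# Made-up sample lines in the same shape as the quotes document.
sample = [
    "4. It is the challenge- Peter Drucker",
    "13. Winners do what losers are unwilling to do",
]
result = parse_quotes(sample)
```

A line with no hyphen, like the second sample, ends up with a missing `Author_Comment`, which is the kind of row step 7 is meant to catch.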

In [137]:
import pandas as pd
import re

In [138]:
#read in the quotes
gq_df = pd.read_clipboard(sep=r'\n',header = None)
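
Reading from the clipboard depends on what was last copied, so the cell is hard to re-run reproducibly. As a hedged alternative, the same one-column shape can be built from a plain string. The names `sample_text` and `sample_df` and the two sample lines are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for the text copied from the quotes document.
sample_text = (
    "4. It is the challenge- Peter Drucker\n"
    "\n"
    "5. You must learn from the mistakes of others- Sam Levenson.\n"
)

# One row per non-empty line; the single column is labelled 0,
# the same label that header=None produces.
lines = [line for line in sample_text.splitlines() if line.strip()]
sample_df = pd.DataFrame(lines)
```

Because the column is labelled `0`, later cells that index with `[0]` would work unchanged on a frame built this way.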

In [139]:
#set display wide enough to see the full quotes
pd.set_option('display.max_colwidth', None)  #None means no truncation; -1 is deprecated since pandas 1.0
pd.set_option('display.max_rows', 1000)
gq_df.head()

Unnamed: 0,0
0,4. It is the leaders abiding challenge to make peoples strengths effective and their weaknesses irrelevant- Peter Drucker
1,5. You must learn from the mistakes of others. You can’t possibly live long enough to make them all yourself- Sam Levenson.
2,"6. If a problem causes many meetings, the meetings eventually become more important than the problem. -Hendrickson’s Law"
3,7. Executives spend more time on managing people and making people decisions than on anything else. No other decisions are so long lasting in their consequences or so difficult to unmake- Peter Drucker
4,"8. Never believe that a few caring people can’t change the world. For indeed, that’s all who ever have- Ann McGee Cooper."


In [140]:
#split off the quote number at the first period; n = 1 leaves later periods inside the quote
gq_df = gq_df[0].str.split(pat = '.', n = 1, expand = True)
gq_df.head()

Unnamed: 0,0,1
0,4,It is the leaders abiding challenge to make peoples strengths effective and their weaknesses irrelevant- Peter Drucker
1,5,You must learn from the mistakes of others. You can’t possibly live long enough to make them all yourself- Sam Levenson.
2,6,"If a problem causes many meetings, the meetings eventually become more important than the problem. -Hendrickson’s Law"
3,7,Executives spend more time on managing people and making people decisions than on anything else. No other decisions are so long lasting in their consequences or so difficult to unmake- Peter Drucker
4,8,"Never believe that a few caring people can’t change the world. For indeed, that’s all who ever have- Ann McGee Cooper."
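
The `n = 1` argument matters because many quotes contain periods of their own. A small sketch on one sample line (shortened from quote 5, with an ASCII apostrophe) shows the difference; the variable names are illustrative only.

```python
import pandas as pd

s = pd.Series([
    "5. You must learn from the mistakes of others. "
    "You can't possibly live long enough- Sam Levenson."
])

# A one-character pattern is treated as a literal string, not a regex.
limited = s.str.split(".", n=1, expand=True)   # split at the first period only
unlimited = s.str.split(".", expand=True)      # split at every period

print(limited.shape[1], unlimited.shape[1])
```

Without the limit, the quote text is scattered across several columns and the trailing period produces an empty final column.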


In [141]:
#rename columns
gq_df.rename(columns = {0: 'Quote_number', 1: 'Quote_Author_Comment'}, inplace = True)
gq_df.head()

Unnamed: 0,Quote_number,Quote_Author_Comment
0,4,It is the leaders abiding challenge to make peoples strengths effective and their weaknesses irrelevant- Peter Drucker
1,5,You must learn from the mistakes of others. You can’t possibly live long enough to make them all yourself- Sam Levenson.
2,6,"If a problem causes many meetings, the meetings eventually become more important than the problem. -Hendrickson’s Law"
3,7,Executives spend more time on managing people and making people decisions than on anything else. No other decisions are so long lasting in their consequences or so difficult to unmake- Peter Drucker
4,8,"Never believe that a few caring people can’t change the world. For indeed, that’s all who ever have- Ann McGee Cooper."


In [142]:
#make two new columns: Quote gets the text before the first hyphen; Author_Comment is left empty here and filled in the next cell
gq_df[['Quote','Author_Comment']] = gq_df['Quote_Author_Comment'].str.split(pat = r'-.+$', n = 1,expand = True)
gq_df.head()

Unnamed: 0,Quote_number,Quote_Author_Comment,Quote,Author_Comment
0,4,It is the leaders abiding challenge to make peoples strengths effective and their weaknesses irrelevant- Peter Drucker,It is the leaders abiding challenge to make peoples strengths effective and their weaknesses irrelevant,
1,5,You must learn from the mistakes of others. You can’t possibly live long enough to make them all yourself- Sam Levenson.,You must learn from the mistakes of others. You can’t possibly live long enough to make them all yourself,
2,6,"If a problem causes many meetings, the meetings eventually become more important than the problem. -Hendrickson’s Law","If a problem causes many meetings, the meetings eventually become more important than the problem.",
3,7,Executives spend more time on managing people and making people decisions than on anything else. No other decisions are so long lasting in their consequences or so difficult to unmake- Peter Drucker,Executives spend more time on managing people and making people decisions than on anything else. No other decisions are so long lasting in their consequences or so difficult to unmake,
4,8,"Never believe that a few caring people can’t change the world. For indeed, that’s all who ever have- Ann McGee Cooper.","Never believe that a few caring people can’t change the world. For indeed, that’s all who ever have",


In [143]:
#extract the author and comment (everything from the first hyphen onward) and populate the Author_Comment column
gq_df['Author_Comment'] = gq_df['Quote_Author_Comment'].str.extract(pat = r'(-.+)', expand = False)
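
Both patterns above cut at the first hyphen, so a hyphenated word inside a quote would be mistaken for the start of the author. One possible variation, not the method used in this notebook, is a single `.str.extract` with named groups that only treats a hyphen as the separator when whitespace sits on at least one side of it. The sample strings and the heuristic itself are assumptions; an author written as "quote-Name" with no spaces would not be caught.

```python
import pandas as pd

# Made-up samples: an interior hyphen, a "space then hyphen" author, and no author.
samples = pd.Series([
    "No other decisions are so long-lasting in their consequences- Peter Drucker",
    "The meetings become more important than the problem. -Hendrickson's Law",
    "Winners do what losers are unwilling to do",
])

# Quote: shortest prefix; Author_Comment: optional, starts at the first hyphen
# that is preceded or followed by whitespace.
pattern = r"^(?P<Quote>.+?)(?:\s*(?P<Author_Comment>(?:(?<=\s)-|-(?=\s)).+))?$"

# Named groups become the column names of the resulting DataFrame.
parsed = samples.str.extract(pattern)
```

Rows without a separator hyphen keep the whole text in `Quote` and get a missing `Author_Comment`.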

In [144]:
#review Dataframe for errors
gq_df

Unnamed: 0,Quote_number,Quote_Author_Comment,Quote,Author_Comment
0,4,It is the leaders abiding challenge to make peoples strengths effective and their weaknesses irrelevant- Peter Drucker,It is the leaders abiding challenge to make peoples strengths effective and their weaknesses irrelevant,- Peter Drucker
1,5,You must learn from the mistakes of others. You can’t possibly live long enough to make them all yourself- Sam Levenson.,You must learn from the mistakes of others. You can’t possibly live long enough to make them all yourself,- Sam Levenson.
2,6,"If a problem causes many meetings, the meetings eventually become more important than the problem. -Hendrickson’s Law","If a problem causes many meetings, the meetings eventually become more important than the problem.",-Hendrickson’s Law
3,7,Executives spend more time on managing people and making people decisions than on anything else. No other decisions are so long lasting in their consequences or so difficult to unmake- Peter Drucker,Executives spend more time on managing people and making people decisions than on anything else. No other decisions are so long lasting in their consequences or so difficult to unmake,- Peter Drucker
4,8,"Never believe that a few caring people can’t change the world. For indeed, that’s all who ever have- Ann McGee Cooper.","Never believe that a few caring people can’t change the world. For indeed, that’s all who ever have",- Ann McGee Cooper.
5,9,"We do not allow ourselves the facile, rather theatrical declaration that this moment in which we exist is one of total perdition, in the abyss of darkness, or a triumph of daybreak, etc. It is a time like any other, a time which is never quite like any other- Michel Foucault","We do not allow ourselves the facile, rather theatrical declaration that this moment in which we exist is one of total perdition, in the abyss of darkness, or a triumph of daybreak, etc. It is a time like any other, a time which is never quite like any other",- Michel Foucault
6,10,"To venture causes anxiety, but not to venture is to lose one’s self. And to venture in the highest sense is precisely to become conscious of one’s self- Soren Kierkegaard.","To venture causes anxiety, but not to venture is to lose one’s self. And to venture in the highest sense is precisely to become conscious of one’s self",- Soren Kierkegaard.
7,11,The leader must have infectious optimism…The final test of a leader is the feeling you have when you leave his presence after a conference. Have you a feeling of uplift and confidence?- Field General Bernard Montgomery.,The leader must have infectious optimism…The final test of a leader is the feeling you have when you leave his presence after a conference. Have you a feeling of uplift and confidence?,- Field General Bernard Montgomery.
8,12,It is more convenient to assume that reality is similar to our preconceived ideas than to freshly observe what we have before our eyes- Robert Fritz.,It is more convenient to assume that reality is similar to our preconceived ideas than to freshly observe what we have before our eyes,- Robert Fritz.
9,13,The single largest factor in winning: Winners do what losers are unwilling to do,The single largest factor in winning: Winners do what losers are unwilling to do,
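
Scanning every row by eye works for a short list, but the review can be narrowed with a couple of boolean masks. The sketch below is an assumption about what counts as suspicious (no author found, or more than one hyphen in the raw text) and uses made-up sample strings.

```python
import pandas as pd

raw = pd.Series([
    "It is the challenge- Peter Drucker",
    "Winners do what losers are unwilling to do",
    "A well-known saying- Anonymous",
])

# Same extraction as in the notebook: everything from the first hyphen onward.
author = raw.str.extract(r"(-.+)", expand=False)

no_author = author.isna()                 # no hyphen at all, so no author was found
many_hyphens = raw.str.count("-") > 1     # first-hyphen split may have cut mid-quote

to_review = raw[no_author | many_hyphens]
```

Here `to_review` holds the second and third samples, which are the rows worth checking against the source document.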


In [145]:
#drop unneeded columns
gq_df.drop(['Quote_number', 'Quote_Author_Comment'], axis = 1, inplace = True)
gq_df.head()

Unnamed: 0,Quote,Author_Comment
0,It is the leaders abiding challenge to make peoples strengths effective and their weaknesses irrelevant,- Peter Drucker
1,You must learn from the mistakes of others. You can’t possibly live long enough to make them all yourself,- Sam Levenson.
2,"If a problem causes many meetings, the meetings eventually become more important than the problem.",-Hendrickson’s Law
3,Executives spend more time on managing people and making people decisions than on anything else. No other decisions are so long lasting in their consequences or so difficult to unmake,- Peter Drucker
4,"Never believe that a few caring people can’t change the world. For indeed, that’s all who ever have",- Ann McGee Cooper.


In [146]:
#write to .csv file; index = False keeps only the "Quote" and "Author_Comment" columns in the file
gq_df.to_csv(r'C:\Users\drrdm\Good quotes\Good_quotes_upload.csv', index = False)
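
By default `to_csv` also writes the row index as an extra, unnamed first column, which shows up as `Unnamed: 0` when the file is read back. A minimal sketch of the difference, using an in-memory buffer and made-up rows instead of the real file:

```python
import io
import pandas as pd

out_df = pd.DataFrame({
    "Quote": ["If a problem causes many meetings, the meetings win."],
    "Author_Comment": ["-Hendrickson's Law"],
})

# Default: the index is written, so the header starts with an empty field.
with_index = out_df.to_csv()

# index=False writes only the two named columns.
buffer = io.StringIO()
out_df.to_csv(buffer, index=False)
buffer.seek(0)
round_trip = pd.read_csv(buffer)
```

The comma inside the sample quote is quoted automatically on write, so it survives the round trip intact.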