# Readme Manager

This notebook generates every `README.md` file in my repository. Note that this project is currently under-documented.

In [1]:
import pandas as pd

projects = pd.read_csv('assets/DataCampProjects.csv', index_col=0)
projects

Index,language,title,description,categories,guided,completed
0,Python,What and Where are the World's Oldest Businesses,Use joining techniques to discover the oldest ...,['Data Manipulation'],guided,0
1,R,Bad Passwords and the NIST Guidelines,Check what passwords fail to conform to the Na...,"['Data Manipulation', 'Importing & Cleaning Da...",guided,0
2,Python,Analyze Your Runkeeper Fitness Data,"Import, clean, and analyze seven years worth o...","['Data Manipulation', 'Data Visualization', 'P...",guided,0
3,R,Are You Ready for the Zombie Apocalypse?,Use your logistic regression skills to protect...,"['Data Manipulation', 'Data Visualization', 'P...",guided,0
4,Python,Real-time Insights from Social Media Data,Learn to analyze Twitter data and do a deep di...,"['Data Manipulation', 'Data Visualization', 'P...",guided,0
...,...,...,...,...,...,...
103,Python,Extract Stock Sentiment from News Headlines,Scrape news headlines for FB and TSLA then app...,"['Data Manipulation', 'Data Visualization', 'P...",guided,0
104,Python,Find Movie Similarity from Plot Summaries,Use NLP and clustering on movie plot summaries...,"['Data Manipulation', 'Data Visualization', 'M...",guided,0
105,R,Functions for Food Price Forecasts,Write functions to forecast time series of foo...,"['Data Manipulation', 'Data Visualization', 'I...",guided,0
106,SQL,Online News SQL Certification,Following the changes in working habits during...,"['Case Studies', 'Data Manipulation', 'Importi...",unguided,1


In [2]:
def BrainStation_string(depth): 
    return f"""## BrainStation Projects
![BrainStation Logo]({'../'*depth}assets/BrainStation_Primary_Logo.png)
This folder contains projects I completed while in the BrainStation Online Bootcamp (brainstation.io).

These projects showcase skills I have developed in Data Science, including:

- **Data Analysis in SQL**: *Bixi Project - Part 1*
- **Visual Analytics in Tableau**: *Bixi Project - Part 2*
- **Cleaning and EDA**: *Statistics & Public Health 1*
- **Data Analysis**: *Statistics & Public Health 2*
- **Exploratory Data Analysis & Data Wrangling**: *Natural Language Processing With Hotel Review Part 1*
- **Modeling**: *Natural Language Processing With Hotel Review Part 2*
- **Big Data Fundamentals**: *Big Data Wrangling With Google Books Ngrams*"""

In [3]:
def corn_string(depth):
    return f"""## Regression Models for Predicting Commodity Sales Prices
![image of corn]({'../'*depth}assets/corn_image.png)
This folder contains all of the files relevant to my BrainStation Capstone Project:
***Regression Models for Predicting Commodity Sales Prices***
by Daniel Mortensen

### Subfolder Layout
- Raw data is stored in the "./Data" subfolder.
- All dataframes generated during the importing, cleaning, exploring, or other analysis steps are stored in the "./DataFrames" subfolder.
- Requirements for setting up the different kernels used in this project are stored in the "./kernel requirements" subfolder.
- PDF printouts of the Jupyter notebooks used for this project are stored in the "./PDF Versions of Notebook Files" subfolder.

### Project Notebooks
- "Data Scrubbing.ipynb": used for scrubbing and combining corn, market, and climate data into unified dataframes and for engineering a baseline feature.
- "Data Visualization and Exploratory Analysis.ipynb": used for visualizing the various features and exploring their relevance to the target feature, "PRICE RECEIVED, MEASURED IN $ / BU".
- "Modeling.ipynb": used for testing various machine learning regression models for predicting the target feature, "PRICE RECEIVED, MEASURED IN $ / BU".

### References
- Corn data was taken from: https://quickstats.nass.usda.gov/
- Climate data was taken from: https://www.ncdc.noaa.gov/cag/national/time-series
- US Population data was taken from: https://www.multpl.com/united-states-population/table/by-year"""

In [4]:
completed_projects = projects[projects['completed'] == 1].copy()
completed_counts = completed_projects.groupby('language')['guided'].value_counts()

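# Series.get returns the default (0) when a (language, guided) pair has no completed projects,
# so no lookup can raise a KeyError.
python_guided = completed_counts.get(('Python', 'guided'), 0)
python_unguided = completed_counts.get(('Python', 'unguided'), 0)
SQL_guided = completed_counts.get(('SQL', 'guided'), 0)
SQL_unguided = completed_counts.get(('SQL', 'unguided'), 0)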


In [5]:
python_cats_series = completed_projects[completed_projects['language'] == 'Python']['categories']
python_cats = []

for cat in python_cats_series:
    cat = cat.replace("['", "")
    cat = cat.replace("']", "")
    cat = cat.split("', '")
    for subcat in cat:
        python_cats.append(subcat)


python_cats_set = list(set(python_cats))
python_cats_set.sort()
python_cats_string = "\t<ins>These projects focus on the following topics</ins>:   \n"
for cat in python_cats_set:
    python_cats_string += "\t- " + cat + '   \n'
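
The string replacements in the cell above depend on the exact quoting of the `categories` column. A minimal sketch of an alternative, assuming each cell holds a valid Python list literal (as the preview of `projects` suggests). `parse_categories` is a hypothetical helper name and the sample cells are illustrative, not rows from the CSV:

```python
import ast

def parse_categories(cell):
    """Parse a cell such as "['A', 'B']" into a list of category names."""
    # ast.literal_eval only evaluates Python literals; it does not run arbitrary code.
    return list(ast.literal_eval(cell))

# Illustrative values only (assumption): two cells shaped like the 'categories' column.
sample_cells = [
    "['Data Manipulation']",
    "['Data Manipulation', 'Importing & Cleaning Data']",
]

# Flatten, de-duplicate, and sort, mirroring what python_cats_set holds.
unique_cats = sorted({cat for cell in sample_cells for cat in parse_categories(cell)})
print(unique_cats)
```

This approach would also cope with category names that contain an apostrophe, which the `split("', '")` approach may mis-split.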


In [6]:
python_guided_cats = '<ins>Projects listed by category</ins>:   \n'
python_unguided_cats = '<ins>Projects listed by category</ins>:   \n'

for cat in python_cats_set:
    guided_projects = []
    unguided_projects = []
    for ind in completed_projects[(completed_projects['guided'] == 'guided') & (completed_projects['language'] == 'Python')].index:
        if cat in completed_projects.loc[ind, 'categories']:
            guided_projects.append(completed_projects.loc[ind, 'title'])
    if len(guided_projects):
        python_guided_cats += f'   {cat}   \n  '
        for project in guided_projects:
            python_guided_cats += '\t' + '- "' + project + '"   \n'
            
    for ind in completed_projects[(completed_projects['guided'] == 'unguided') & (completed_projects['language'] == 'Python')].index:
        if cat in completed_projects.loc[ind, 'categories']:
            unguided_projects.append(completed_projects.loc[ind, 'title'])
    if len(unguided_projects):
        python_unguided_cats += f'   {cat}   \n  '
        for project in unguided_projects:
            python_unguided_cats += '\t' + '- "' + project + '"   \n'


In [7]:
SQL_cats_series = completed_projects[completed_projects['language'] == 'SQL']['categories']
SQL_cats = []

for cat in SQL_cats_series:
    cat = cat.replace("['", "")
    cat = cat.replace("']", "")
    cat = cat.split("', '")
    for subcat in cat:
        SQL_cats.append(subcat)


SQL_cats_set = list(set(SQL_cats))
SQL_cats_set.sort()
SQL_cats_string = "\t<ins>These projects focus on the following topics</ins>:   \n"
for cat in SQL_cats_set:
    SQL_cats_string += "\t- " + cat + '   \n'


In [8]:
SQL_guided_cats = '<ins>Projects listed by category</ins>:   \n'
SQL_unguided_cats = '<ins>Projects listed by category</ins>:   \n'

for cat in SQL_cats_set:
    guided_projects = []
    unguided_projects = []
    for ind in completed_projects[(completed_projects['guided'] == 'guided') & (completed_projects['language'] == 'SQL')].index:
        if cat in completed_projects.loc[ind, 'categories']:
            guided_projects.append(completed_projects.loc[ind, 'title'])
    if len(guided_projects):
        SQL_guided_cats += f'   {cat}   \n  '
        for project in guided_projects:
            SQL_guided_cats += '\t' + '- "' + project + '"   \n'
            
    for ind in completed_projects[(completed_projects['guided'] == 'unguided') & (completed_projects['language'] == 'SQL')].index:
        if cat in completed_projects.loc[ind, 'categories']:
            unguided_projects.append(completed_projects.loc[ind, 'title'])
    if len(unguided_projects):
        SQL_unguided_cats += f'   {cat}   \n  '
        for project in unguided_projects:
            SQL_unguided_cats += '\t' + '- "' + project + '"   \n'
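
The Python and SQL cells above repeat the same loop four times. A possible refactoring, sketched under the assumption that the rows are available as dictionaries (for example via `completed_projects.to_dict('records')`). `projects_by_category` is a hypothetical helper and the demo rows are made up for illustration:

```python
def projects_by_category(rows, language, guided, categories):
    """Build the markdown listing of project titles grouped by category.

    `rows` is an iterable of dicts with 'language', 'guided',
    'categories' and 'title' keys.
    """
    text = '<ins>Projects listed by category</ins>:   \n'
    subset = [r for r in rows if r['language'] == language and r['guided'] == guided]
    for cat in categories:
        # Same substring test as the cells above: 'categories' is stored as a string.
        titles = [r['title'] for r in subset if cat in r['categories']]
        if titles:
            text += f'   {cat}   \n  '
            for title in titles:
                text += '\t' + '- "' + title + '"   \n'
    return text

# Made-up rows (assumption) shaped like the completed_projects dataframe.
demo_rows = [
    {'language': 'SQL', 'guided': 'guided', 'categories': "['Case Studies']", 'title': 'Project A'},
    {'language': 'SQL', 'guided': 'unguided', 'categories': "['Data Manipulation']", 'title': 'Project B'},
    {'language': 'Python', 'guided': 'guided', 'categories': "['Case Studies']", 'title': 'Project C'},
]
demo_listing = projects_by_category(demo_rows, 'SQL', 'guided', ['Case Studies', 'Data Manipulation'])
print(demo_listing)
```

With such a helper, each of the four listings would become a single call that differs only in the `language` and `guided` arguments.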
    

In [9]:
def logos_projects(depth):
    return "These projects showcase skills I have developed using:   \n   \n" + \
            "### Python   \n" + \
            f"![Python Logo]({'../'*depth}assets/python.png)    \n" + \
            f"\t- {python_guided} guided projects   \n" + \
            f"\t- {python_unguided} unguided projects   \n   \n" + \
            f"{python_cats_string}   \n   \n" + \
            "### SQL   \n" + \
            "(SQL server)   \n" + \
            f"![SQL Logo]({'../'*depth}assets/SQL.png)   \n" + \
            f"\t- {SQL_guided} guided projects   \n" + \
            f"\t- {SQL_unguided} unguided projects   \n   \n" + \
            f"{SQL_cats_string}   \n"
    
def DC_logo(depth):
    return f"![DataCamp Logo]({'../'*depth}assets/datacamp.png)  \nThis folder contains projects I completed on DataCamp ([datacamp.com](datacamp.com))"

def python_logo(depth):
    return f"![Python Logo]({'../'*depth}assets/python.png)   \nThis folder contains projects I completed on DataCamp ([datacamp.com](datacamp.com)) using **Python**"

def SQL_logo(depth):
    return f"![SQL Logo]({'../'*depth}assets/SQL.png)   \nThis folder contains projects I completed on DataCamp ([datacamp.com](datacamp.com)) using **SQL**   \n(SQL server, in a Jupyter Notebook)"

guided_string = "- **Guided**: projects with detailed instructions and tests for checking results."
unguided_string = "- **Unguided**: projects with relatively few instructions."

guided_description = "These are **Guided** projects, with detailed instructions and tests for checking results."
unguided_description = "**Unguided Projects** come with relatively few instructions."

format_string = "Projects are provided in *.ipynb* (Jupyter notebook), *.html*, & *.pdf* formats."


In [10]:
dsp_string = f"""# Data-Science-Projects

This repository holds my data science projects, including:
- Various projects I completed on DataCamp (datacamp.com): **DataCamp Projects**
- Various projects I completed while in the BrainStation Online Bootcamp (brainstation.io): **BrainStation Projects**
- My main portfolio project: **Regression Models for Predicting Commodity Sales Prices**  
- Other personal projects I have worked on  
  
  
## DataCamp Projects
{DC_logo(0)}.  

{logos_projects(0)}
{guided_string}
{unguided_string}


{BrainStation_string(0)}


{corn_string(0)}

"""
print(dsp_string)

# Data-Science-Projects

This repository holds my data science projects, including:
- Various projects I completed on DataCamp (datacamp.com): **DataCamp Projects**
- Various projects I completed while in the BrainStation Online Bootcamp (brainstation.io): **BrainStation Projects**
- My main portfolio project: **Regression Models for Predicting Commodity Sales Prices**  
- Other personal projects I have worked on  
  
  
## DataCamp Projects
![DataCamp Logo](assets/datacamp.png)  
This folder contains projects I completed on DataCamp ([datacamp.com](datacamp.com)).  

These projects showcase skills I have developed using:   
   
### Python   
![Python Logo](assets/python.png)    
	- 9 guided projects   
	- 0 unguided projects   
   
	<ins>These projects focus on the following topics</ins>:   
	- Applied Finance   
	- Case Studies   
	- Data Manipulation   
	- Data Visualization   
	- Importing & Cleaning Data   
	- Machine Learning   
	- Probability & Statistics   
	- Programming   
  

In [11]:
dcp_string = f"""# DataCamp Projects
{DC_logo(1)}. 

{logos_projects(1)}
{format_string}
"""

In [12]:
pp_string = f"""# Python Projects (DataCamp)  
{python_logo(2)}

{guided_string}
    - {python_guided} guided projects

{python_guided_cats}


{unguided_string}
    - {python_unguided} unguided projects
    
{python_unguided_cats}

{format_string}
"""

In [13]:
gpp_string = f"""# Guided Python Projects (DataCamp)
{python_logo(3)}

{guided_description}

{python_guided_cats}

{format_string}
"""

In [14]:
upp_string = f"""# Unguided Python Projects (DataCamp)
{python_logo(3)}

{unguided_description}

{python_unguided_cats}

{format_string}
"""

In [15]:
sp_string = f"""# SQL Projects (DataCamp)  
{SQL_logo(2)}

{guided_string}
    - {SQL_guided} guided projects

{SQL_guided_cats}


{unguided_string}
    - {SQL_unguided} unguided projects
    
{SQL_unguided_cats}
"""

In [16]:
gsp_string = f"""# Guided SQL Projects (DataCamp)  
{SQL_logo(3)}

{guided_description}

{SQL_guided_cats}

{format_string}
"""

In [17]:
usp_string = f"""# Unguided SQL Projects (DataCamp)  
{SQL_logo(3)}

{unguided_description}

{SQL_unguided_cats}

{format_string}
"""

In [18]:
main_readme = '../../../Data-Science-Projects/README.md'
datacamp_readme = '../../../Data-Science-Projects/DataCamp Projects/README.md'
python_readme = '../../../Data-Science-Projects/DataCamp Projects/Python/README.md'
python_guided_readme = '../../../Data-Science-Projects/DataCamp Projects/Python/Guided/README.md'
python_unguided_readme = '../../../Data-Science-Projects/DataCamp Projects/Python/Unguided/README.md'
SQL_readme = '../../../Data-Science-Projects/DataCamp Projects/SQL/README.md'
SQL_guided_readme = '../../../Data-Science-Projects/DataCamp Projects/SQL/Guided/README.md'
SQL_unguided_readme = '../../../Data-Science-Projects/DataCamp Projects/SQL/Unguided/README.md'
BrainStation_readme = '../../../Data-Science-Projects/BrainStation Projects/README.md'
corn_readme = '../../../Data-Science-Projects/Regression Models for Predicting Commodity Sales Prices/README.md'

readmes = [
    main_readme, 
    datacamp_readme, 
    python_readme, 
    python_guided_readme, 
    python_unguided_readme, 
    SQL_readme, 
    SQL_guided_readme, 
    SQL_unguided_readme,
    BrainStation_readme,
    corn_readme
]

content_strings = [
    dsp_string,
    dcp_string,
    pp_string,
    gpp_string,
    upp_string,
    sp_string,
    gsp_string,
    usp_string,
    BrainStation_string(1),
    corn_string(1)
]

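# Pair each README path with its content; an explicit encoding keeps the output identical across platforms.
for readme_path, content in zip(readmes, content_strings):
    with open(readme_path, 'w', encoding='utf-8') as r:
        r.write(content)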
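
The write loop above relies on two parallel lists staying in the same order and on every target folder already existing. A minimal sketch of one alternative, assuming a single mapping from relative path to content is acceptable. `write_readmes` is a hypothetical helper, and the demo writes into a temporary directory rather than the real repository:

```python
import tempfile
from pathlib import Path

def write_readmes(contents_by_path, root):
    """Write each README string to root/relative_path, creating folders as needed."""
    written = []
    for relative_path, content in contents_by_path.items():
        target = Path(root) / relative_path
        # Create missing parent folders instead of failing with FileNotFoundError.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding='utf-8')
        written.append(target)
    return written

# Demo with placeholder content (assumption); nothing outside the temp folder is touched.
with tempfile.TemporaryDirectory() as tmp:
    demo_paths = write_readmes(
        {
            'README.md': '# Data-Science-Projects\n',
            'DataCamp Projects/README.md': '# DataCamp Projects\n',
        },
        tmp,
    )
    print([p.relative_to(tmp).as_posix() for p in demo_paths])
```

Keeping each path next to its content in one dictionary means a README cannot be paired with the wrong string if an entry is added or reordered.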