# Andrew Ingrassia | *Data Analysis Career Exploration*

In [6]:
import pandas as pd
from matplotlib import pyplot as plt

df1 = pd.read_csv("DataAnalyst.csv")               # see "DF1 OVERVIEW" for description
df2 = pd.read_csv("data_survey.csv")               # see "DF2 OVERVIEW" for description
df3 = pd.read_csv("bls_projections.csv")           # see "DF3 OVERVIEW" for description

***

## **PROJECT GOALS**


1) Demonstrate a basic understanding of data analysis tools, concepts, and techniques


2) Convey my suitability for a career in data analysis


3) Describe the circumstances that led me to pursue a career in data analysis


4) Practice my newly acquired data analysis skills by putting them to use in the context of a real-world project of personal significance


5) Investigate data analysis as a personal career path with the following questions in mind:

- *How healthy is the data science job market?*
- *What data analysis tools/languages/skills are in high demand?*
- *What effect should I expect my lack of formal computer science education to have on my initial job search?*
- *What would be a reasonable salary expectation (short-term and long-term)?* 
- *What should I expect in terms of work/life balance as a data analyst?*

***

## **BACKGROUND**

1) PERSONAL HISTORY

I attended Lindenwood University in St. Charles, MO and obtained a bachelor's degree in "Missions and Social Justice" (a subset of Lindenwood's "Christian Ministry Studies" program) in 2011. At the time I was also a medic in the Missouri Army National Guard and worked part-time for a family member as a house painter. I pursued that particular degree with the intention of going on to seminary and eventually entering full-time ministry, but those plans changed near the end of my time at Lindenwood due to a significant shift in worldview. That being the case, I was uncertain about how to proceed with life after graduation. Life carried on regardless of my uncertainty, and I got married, had two children, and started a residential painting business where I carved out a comfortable niche for myself specializing in high-end interior painting and cabinet refinishing. Painting was a great career that paid well enough and afforded me plenty of my beloved autonomy and independence, but I became increasingly unsatisfied as time went on. I didn't feel challenged, and in 2019 I started exploring other options.


2) WHY I CHOSE COMPUTER SCIENCE

I have always been interested in computers, but had not seriously considered a career in computer science until after undergoing a battery of personality and career aptitude tests. Every assessment tool that I used indicated strongly that I ought to consider a career in computer science. This is due not only to the fact that computer science happens to align with my personal interests, but also to the fact that I possess certain temperamental qualities and behavioral tendencies common among those who pursue computer-science-related careers.
    

Here are some relevant snippets from my personality test results:

- *Tendency to analyze systems and ideas thoroughly to create deep understanding*
- *Tendency to enjoy designing creative solutions to highly abstract problems*
- *Tendency to enjoy addressing complex theoretical or technical problems with creative, novel solutions*
- *Tendency to choose careers that allow them to use their intellect, analyze concepts, and think deeply*
- *Tendency to enjoy working with ideas more than with people (tech/engineering = good | sales/nursing = bad)*
- *Tendency to be fascinated with logical analysis, systems, and design*
- *Logical, analytical, insightful, curious, independent, and objective*
- *Innovative by nature and often drawn to cutting-edge fields such as technology, engineering, and the sciences*


3) WHY I AM SELF-TAUGHT

After a great deal of consideration, I decided to begin pursuing a second bachelor's degree in computer science. Because of the nature of my first degree, I had not yet completed the math courses required for application. I therefore began taking one math class at a time at my local community college until I fulfilled the requirements for application to Oregon State University's online post-baccalaureate computer science program, which I began in January of 2020. I had heard of "self-taught programmers" before, but at that point I was not familiar enough with the territory to seriously consider that route. It wasn't until my second semester in OSU's program that I became acquainted with the vast expanse of readily available, high-quality, and inexpensive (or in many instances completely free) computer-science-related educational resources. I came to the realization that even though it would mean forgoing a second degree, I could actually learn more effectively by teaching myself than I could by sticking with OSU's program. So after careful consideration, I decided to drop out of OSU in my second semester and begin learning independently - a decision that I viewed as conveying the following advantages:
    
- *I could spend significantly less money and incur no additional student loan debt*
- *I could learn at my own pace*
- *I could learn exclusively from materials that I found to be more useful and effective compared to the OSU curriculum*
- *I could focus more precisely on developing job-specific skills*
- *I could be job-ready in less time*
- *I could have a more flexible schedule while I learned, freeing me up to watch my kids when needed (to save on daycare costs) as well as to pick up side work as a painter*
    

4) WHY I CHOSE DATA ANALYSIS

I have felt drawn toward data science ever since I began thinking about switching careers. It strikes me as a field that is tailor-made for people with my temperament. I initially assumed that pursuing data science in any form was off the table because doing so would entail taking so many prerequisite math courses that I wouldn't be able to obtain a degree in a reasonable amount of time. But I eventually came to understand that under the umbrella of “data science” exists a wide spectrum of responsibilities and associated educational requirements. I was thrilled to learn that a career in data analysis was actually a practical, achievable goal. Since then I have enjoyed learning to work with large datasets using Python in conjunction with tools like Pandas, NumPy, Matplotlib, and Jupyter Notebooks. I feel as though I have found a niche into which my talents, personality, and interests fit perfectly, and I am excited to begin putting my newly acquired skills to work in the real world.


5) THE DATA

Throughout the remainder of this project, I will be analyzing three datasets with the goal of using the resulting insights to answer the questions posed in the "PROJECT GOALS" section. What follows, then, is a brief description of each of the datasets, followed by my efforts to answer each of the questions in turn.

***

## **DF1 OVERVIEW**: "Data Analyst Jobs"


*Dataset Link*: https://www.kaggle.com/andrewmvd/data-analyst-jobs

This dataset contains information related to *data analysis* as a career. The data came from over 2,000 data analysis job listings scraped from glassdoor.com, and is designed to help job seekers:

- *Find the best jobs by salary and company rating*
- *Explore the skills required in job descriptions*
- *Predict salary based on industry, location, and company revenue*

***

## **DF2 OVERVIEW**: "Salary and More - Data Scientist, Analyst, Engineer"


*Dataset Link*: https://www.kaggle.com/phuchuynguyen/salary-and-moredata-scientist-analyst-engineer

This dataset is derived from the "Stack Overflow Annual Developer Survey" and is designed to help evaluate the impact of different variables on expected salary (for people who work in fields related to data science).

Variables include:

- *Country*
- *Education level*
- *Employment*
- *Job satisfaction*
- *Organization size*
- *Undergraduate major*
- *Number of years coding professionally*

***

## **DF3 OVERVIEW**: "Bureau of Labor Statistics - Occupational Projections Data"


*Dataset Link*: https://data.bls.gov/projections/occupationProj | *Full Dataset Explanation*: https://www.bls.gov/emp/documentation/nem-definitions.htm


This dataset comes from the US Bureau of Labor Statistics and presents occupational data related to:

- *Historical and projected employment by occupation (2019 - 2029)*
- *Projections of separations from occupations that will result in openings for new workers (2019 - 2029)*
- *Typical education, experience, and training requirements for each occupation*
- *Average annual occupational openings (2019 - 2029)*
- *Median annual wages (as of 2020)*
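Before turning to the questions, it can help to run a quick structural check on each dataset. The sketch below is one possible way to do that, not part of the original analysis: `overview` is a hypothetical helper name, and the small `sample` frame (with made-up column names and values) only stands in for `df1`, `df2`, or `df3` so that the example runs without the CSV files.

```python
import pandas as pd


def overview(df: pd.DataFrame, name: str) -> dict:
    """Summarize a DataFrame: row count, column names, and missing values per column."""
    return {
        "name": name,
        "rows": df.shape[0],
        "columns": list(df.columns),
        "missing": df.isna().sum().to_dict(),
    }


# Hypothetical stand-in data; the column names and values are illustrative only.
sample = pd.DataFrame(
    {
        "Job Title": ["Data Analyst", "Senior Data Analyst", None],
        "Rating": [3.5, None, 4.1],
    }
)

summary = overview(sample, "sample")
print(summary)
```

With the real files loaded, the same helper could be called as `overview(df1, "df1")`, and likewise for `df2` and `df3`, to compare their sizes and spot columns with many missing values.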

***

### **NEXT SECTION** | QUESTION 1: *How healthy is the data science job market?*