linking for 5 also done - section 6 left only
sanskritilabroo committed May 17, 2024
1 parent 25d1c28 commit 59b67f3
Showing 1 changed file with 17 additions and 12 deletions.
29 changes: 17 additions & 12 deletions Learn.md
@@ -305,19 +305,20 @@ All the null values were handled for all three data sets and ensured the dataset
| YearsCodePro | 18112 | 0 |
| JobSeek | 2153 | 0 |

# <a name="5 Data Analysis and Visualization">Data Analysis and Visualization</a>
<h1 id="5 Data Analysis and Visualization">Data Analysis and Visualization</h1>

After cleaning all three datasets and handling their outliers, we looked for insights that could be drawn from them.

<img src="https://recodehive.com/wp-content/uploads/2021/05/Message-from-Founder-1024x576.jpg">

## <a name="5.1 Distribution of respondents based on country">5.1) Distribution of respondents based on country</a>
<h2 id="5.1 Distribution of respondents based on country">5.1) Distribution of respondents based on country</h2>

We used `plotly` to create a geo plot showing where the respondents are located around the world. The largest number of respondents is from the United States, and India is second.

<img src="Data/Images/Geo plot.png">
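
The per-country counts behind a map like this can be sketched as follows. This is a minimal illustration using only the standard library, not the code used in this project; the `countries` list is a hypothetical sample, not survey data, and `respondents_by_country` is a name introduced here for illustration.

```python
from collections import Counter

# Hypothetical sample of a country column (NOT real survey data).
countries = [
    "United States", "India", "United States",
    "Germany", "India", "United States", None,
]


def respondents_by_country(values):
    """Return (country, count) pairs, most respondents first."""
    counts = Counter(v for v in values if v)  # skip missing entries
    return counts.most_common()


ranking = respondents_by_country(countries)
for country, n in ranking:
    print(f"{country}: {n}")
```

Counts in this shape (one row per country) are the usual input for a choropleth-style map.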

## <a name="5.2 Impact of participation rate due to different ethnicity">5.2) Impact of participation rate due to different ethnicity</a>

<h2 id="5.2 Impact of participation rate due to different ethnicity">5.2) Impact of participation rate due to different ethnicity</h2>

Across all three years, respondents identifying as `white or european descent` consistently had the highest participation rate.

@@ -336,29 +337,29 @@ for i, v in enumerate(count):

<img src="Data/Images/Ethnicity vs participation.png">

## <a name="5.3 Most popular programming language in three years">5.3) Most popular programming language in three years</a>
<h2 id="5.3 Most popular programming language in three years">5.3) Most popular programming language in three years</h2>

The most popular language that developers worked with between 2018 and 2020 was JavaScript (14%), followed by HTML/CSS (13%) and SQL (11%). JavaScript and SQL both rose steadily over the three years. The share of HTML/CSS increased slightly from 2018 to 2019, then dropped back to its 2018 level in 2020. Python accounted for about 9% in 2018, fell to 8% in 2019, and rose by 1 percentage point in 2020.

Some languages appeared only in 2019: Elixir, Clojure, F#, WebAssembly, and Erlang. Perl, Haskell, and Julia appeared in 2019 and 2020 with small percentages.

<img src="Data/Images/popular language distribution.png">
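
Percentages like these can be derived from a multi-select column. The sketch below assumes each answer is a single `;`-separated string and that a language's share is its fraction of all language selections (not of respondents); both are assumptions, and the `responses` list is a hypothetical sample rather than survey data.

```python
from collections import Counter

# Hypothetical multi-select answers, one string per respondent (NOT real data).
responses = [
    "JavaScript;HTML/CSS;SQL",
    "Python;SQL",
    "JavaScript;Python",
    None,  # respondent skipped the question
]


def language_shares(rows, sep=";"):
    """Return each language's share (in %) of all language selections."""
    counts = Counter()
    for row in rows:
        if not row:
            continue  # ignore missing answers
        counts.update(lang.strip() for lang in row.split(sep))
    total = sum(counts.values())
    return {lang: round(100 * n / total, 1) for lang, n in counts.items()}


shares = language_shares(responses)
for lang, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{lang}: {pct}%")
```

Running the same function once per survey year would give the year-over-year trend for each language.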

## <a name="5.4 Distribution of developers based on their developer role">5.4) Distribution of developers based on their developer role</a>

<h2 id="5.4 Distribution of developers based on their developer role">5.4) Distribution of developers based on their developer role</h2>

Most of the respondents were either back-end or full-stack developers. Marketing and sales professionals made up the smallest share.

<img src="Data/Images/devtype distribution.png">



## <a name="5.5 Distribution of respondents based on age">5.5) Distribution of respondents based on age</a>
<h2 id="5.5 Distribution of respondents based on age">5.5) Distribution of respondents based on age</h2>

Most of the respondents are in the 25-29 age range. This suggests that many respondents joined their companies recently or have less than 5 years of experience.

<img src="Data/Images/age distribution.png">

## <a name="5.6 Salary distribution of top ten countries">5.6) Salary distribution of top ten countries</a>
<h2 id="5.6 Salary distribution of top ten countries">5.6) Salary distribution of top ten countries</h2>

Overall, the country with the highest mean annual salary is the United States of America ($240,000), followed by Australia ($164,926). Although India has a high number of respondents, it has the lowest mean salary, $25,213, which suggests that mean salaries in developed countries are much higher than in developing countries.

@@ -384,29 +385,33 @@ plt.show()

<img src="Data/Images/salary top ten countries.png">

## <a name="5.7 Analysis of impact of education on salary">5.7) Analysis of impact of education on salary</a>

<h2 id="5.7 Analysis of impact of education on salary">5.7) Analysis of impact of education on salary</h2>

Respondents with a doctorate have the highest mean salary of all education levels. Respondents with a bachelor's degree have a higher mean salary than those with a master's degree. This may be due to more years of professional coding experience and to the larger number of respondents in that category (35,659 respondents with a bachelor's degree versus 16,940 with a master's degree).

Interestingly, respondents without any degree have a mean salary of $90k. This may reflect improvements in online learning and advances in technology that are shifting companies away from relying on university degrees.

<img src="Data/Images/salary on edlevel.png">
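
Because group sizes differ so much between education levels, it helps to report the respondent count next to each mean. The following is a minimal sketch of that grouping, not the project's own code; the `records` values are hypothetical and `salary_summary` is an illustrative name.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (education level, annual salary) pairs (NOT real survey data).
records = [
    ("Bachelor's degree", 90000),
    ("Bachelor's degree", 110000),
    ("Master's degree", 95000),
    ("Doctoral degree", 130000),
    ("Bachelor's degree", None),  # salary not reported
]


def salary_summary(rows):
    """Return {level: {"n": respondent count, "mean": mean salary}}."""
    groups = defaultdict(list)
    for level, salary in rows:
        if salary is not None:  # drop rows without a salary
            groups[level].append(salary)
    return {
        level: {"n": len(vals), "mean": mean(vals)}
        for level, vals in groups.items()
    }


summary = salary_summary(records)
for level, stats in summary.items():
    print(f"{level}: n={stats['n']}, mean={stats['mean']}")
```

A mean computed from a small group is more sensitive to a few extreme salaries, so the `n` column gives context for comparisons between levels.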

## <a name="5.8 Gender distribution among top five countries in 2019">5.8) Gender distribution among top five countries in 2019</a>

<h2 id="5.8 Gender distribution among top five countries in 2019">5.8) Gender distribution among top five countries in 2019</h2>

For the five countries with the most respondents, we categorized the male and female respondents in each country.

The US has the largest female percentage, about 10.9%, followed by Canada and the UK at 9.6% and 8.0% respectively. Female respondents made up around 5% in India and Germany, the lowest among the top 5 countries.

<img src="Data/Images/gender distribution top 5.png">
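
The per-country female percentage can be computed along these lines. This is a sketch under assumptions: the `rows` sample is hypothetical, and the `"Woman"` label is an assumed category value that may differ from the labels in the cleaned datasets.

```python
from collections import Counter

# Hypothetical (country, gender) pairs (NOT real survey data).
rows = [
    ("United States", "Woman"),
    ("United States", "Man"),
    ("United States", "Man"),
    ("United States", "Man"),
    ("India", "Man"),
    ("India", "Man"),
]


def female_share(pairs, female_label="Woman"):
    """Return {country: percentage of respondents with female_label}."""
    totals = Counter()
    women = Counter()
    for country, gender in pairs:
        totals[country] += 1
        if gender == female_label:
            women[country] += 1
    return {c: round(100 * women[c] / totals[c], 1) for c in totals}


female_pct = female_share(rows)
print(female_pct)
```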

## <a name="5.9 Where most data scientist came from in 2019?">5.9) Where most data scientist came from in 2019?</a>
<h2 id="5.9 Where most data scientist came from in 2019?">5.9) Where most data scientist came from in 2019?</h2>

In `2019`, 5,788 data scientists responded to the Stack Overflow survey. The largest group is from the US, with 1,550 people, 3 times as many as from India. Germany and the UK follow with 427 and 339 people respectively. The rest are Canada, France, the Netherlands, Brazil, Russia, and Australia, each with fewer than 200 data scientists.

<img src="Data/Images/DS_top contries.png">

## <a name="5.10 Countries which pays the most for data scientist in 2019">5.10) Countries which pays the most for data scientist in 2019</a>

<h2 id="5.10 Countries which pays the most for data scientist in 2019">5.10) Countries which pays the most for data scientist in 2019</h2>


In 2019, the three countries with the highest mean annual salary for data scientists were Ireland (`$275,851`), Luxembourg (`$272,769`), and the USA (`$265,211`). In the remaining countries, the mean salary is below `$200,000` per year. Japan has the highest mean annual salary among Asian countries (`$118,969`).
