This repository has been archived by the owner on Jun 23, 2020. It is now read-only.

List of interesting visualizations #41

Open
3 of 21 tasks
SamAI-Software opened this issue May 10, 2016 · 5 comments

Comments

@SamAI-Software
Member

SamAI-Software commented May 10, 2016

This issue is for project tracking and will be updated regularly.
The latest website preview is here.
Please feel free to add some interesting visualizations.
If you want to participate, you can find the data here and the questionnaire here.
The goal is to create D3.js visualizations for all topics from this article and for some facts from this list.
If you have any questions about data, you can ask them at issue #26.
Leave your feedback and ideas about the next survey at issue #39.

The list of interesting visualizations:

Demographics

Socials

Education & Experience

Current job

Future job


@SamAI-Software
Member Author

SamAI-Software commented May 13, 2016

Please drop a comment here when you start working on a new visualization, so it can be assigned to you and we avoid duplicates.

!Important

As we are all working together on one project, please use the recommended break points for consistency, unless you have a special approach that needs its own break points.
If you have any questions, please write a comment in this issue, or ask @evaristoc or @SamAI-Software directly in the FCC Data Science chat room.

The list of recommended groups:

  • Age
    • under 25
    • 25-29
    • 30-39
    • over 39 (40+)
  • Months Programming
    • <1 year (0-11 months)
    • 1-5 years (12-59 months)
    • 5+ years (60+ months)
  • Hours Learning
    • 0-9 hours
    • 10-29 hours
    • 30+ hours

Example:

example
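
The recommended groups above could be applied with a small lookup helper. This is only a sketch: the field names `age`, `monthsProgramming` and `hoursLearning` and the helper `groupOf` are assumptions made for illustration, not names taken from the survey data or from any existing visualization.

```javascript
// Break points copied from the recommended groups above.
// Each entry is [exclusive upper bound, label]; the last bound is Infinity.
// The field names are hypothetical and may not match the dataset columns.
const GROUPS = {
  age: [[25, "under 25"], [30, "25-29"], [40, "30-39"], [Infinity, "40+"]],
  monthsProgramming: [[12, "<1 year"], [60, "1-5 years"], [Infinity, "5+ years"]],
  hoursLearning: [[10, "0-9 hours"], [30, "10-29 hours"], [Infinity, "30+ hours"]],
};

// Return the group label for a value, or null when the answer is missing
// or is not a finite, non-negative number.
function groupOf(field, value) {
  if (value === null || value === undefined || value === "") return null;
  const n = Number(value);
  if (!Number.isFinite(n) || n < 0) return null;
  for (const [upper, label] of GROUPS[field]) {
    if (n < upper) return label;
  }
  return null;
}
```

Keeping the break points in one shared table means every bar chart reads them from the same place, which is the consistency this comment asks for.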

@evaristoc
Collaborator

@SamAI-Software
We need to agree on and recommend break points for interval/ratio measurements (e.g. age, time, money, etc.). We need to have some consistency in the way that data will be presented, if any.

@SamAI-Software
Member Author

@evaristoc, very good point! 👍
I was also thinking about consistency in break points and colors.
I'll try to write a draft version asap today or tomorrow. And of course, feel free to suggest your system.

One question about your Podcast viz - why did you use 11+ months rather than 12+ months (1 year and more)?

@krisgesling
Contributor

On age I've temporarily gone with 0-21, 22-25, 26-29, 30-33, 34+
primarily because it looked nice = 0 statistical validity 😃
codepen.io/krisgesling/pen/GZwYKV
Note: Global stats is currently broken while I switch it to change for each tab. The map is also re-plotted every time you select a tab, which is silly and visually jarring, so I'm going to switch to d3 transforms instead.

@SamAI-Software
Member Author

@krisgesling wow, great viz so far! 💯

Consistency in break points is more for bar charts, so don't worry about it, because you have a different approach - to show new visitors the Respondent's Profile by country in a simple and understandable way.

Your visualization would probably be the first one on the page, so it will be the beginning of the story, if @QuincyLarson doesn't mind.

Once users land on the page and see your viz, they should understand straight away:

  • (basic) which countries the respondents live in;
  • (basic) what the gender split is in each country;
  • (basic) what the average age is in each country;
  • (special) how many respondents belong to an ethnic minority.

And to do that you can find your own break points that will give the best division, so users will see the difference between countries and understand our story.


Country of living.

I see that you changed groups for the first map - all.

kris_all

And that's great, because now we have a good picture showing that the vast majority of respondents live in the USA and India, but there are also many in Europe, North America, etc. 👍


Gender by country.

But now let's look at the gender map.

kris_gender

What can we see here?

  • North Korea, Libya, Mozambique, Lesotho, Belize and Armenia are the top modern countries with a high rate of educated women, where most coders are female, trans or agender.
  • Ethiopia, Niger and Zambia have more female coding learners than USA.
  • All coders in North Korea are trans, agender & genderqueer.

kris_nk

That's probably not the kind of story we want to show...
The reason behind these misleading facts is statistical dispersion / deviation. (@evaristoc or @erictleung can correct me with the proper English term for this.)
It's a well-known problem in statistics, so in every experiment researchers set a minimum number of cases (events) for each group to avoid weird correlations. I'm sure you already know that. The most famous example is coin tossing. If you toss a coin 10 times, you can get heads 8 times, which would lead you to conclude that the odds are 4:1, while if you toss it 10 000 times you are much more likely to get a result close to 1:1.
A few years ago we ran an A/B test with a very small conversion rate, so even 20 000 users didn't give us enough events to see the real result, until we put more than 100 000 new users in each group.
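
To put rough numbers on the coin example, here is a minimal sketch that computes how likely 8 or more heads in 10 tosses of a fair coin is, and how the standard error of an observed share shrinks as the sample grows. The function names are made up for this illustration, and it assumes a fair coin and independent tosses; this effect is usually called sampling variability.

```javascript
// Binomial coefficient C(n, k), built up iteratively.
function choose(n, k) {
  let c = 1;
  for (let i = 1; i <= k; i++) {
    c = (c * (n - k + i)) / i;
  }
  return c;
}

// Probability of getting at least k heads in n tosses of a fair coin.
function probAtLeast(n, k) {
  let ways = 0;
  for (let i = k; i <= n; i++) ways += choose(n, i);
  return ways / Math.pow(2, n);
}

// Standard error of an observed share p measured on n respondents.
function standardError(p, n) {
  return Math.sqrt((p * (1 - p)) / n);
}

console.log(probAtLeast(10, 8));        // 56 / 1024, about 5.5%
console.log(standardError(0.5, 10));    // about 0.158
console.log(standardError(0.5, 10000)); // about 0.005
```

So a 4:1 result from 10 tosses is not rare at all, while with 10 000 tosses the observed share typically stays within about one percentage point of 50%. The same logic applies to a country with only a handful of respondents.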

So what to do?

Minimum size of a group.

The solution can be very easy. For example, you can set a minimum number of respondents for a country to be colored on the map; otherwise it will stay white.
Good practice is to set the minimum size of a group to at least 100 people, but we don't have many respondents this year, so you can also set it to 50 or even 20, if you feel that the result will give us a realistic picture.
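
A minimal sketch of that rule, assuming the data has already been aggregated into one row per country. The row shape `{ country, total, female }`, the helper names and the counts in the usage lines are all invented for illustration and are not taken from the survey.

```javascript
// Minimum number of respondents a country needs before it gets a color.
// 20 is the lowest value suggested above; 50 or 100 are the safer choices.
const MIN_GROUP_SIZE = 20;

// rows: [{ country, total, female }] (hypothetical shape)
// Returns a Map of country -> female share, or null for small groups.
function femaleShareByCountry(rows, minSize = MIN_GROUP_SIZE) {
  const shares = new Map();
  for (const { country, total, female } of rows) {
    shares.set(country, total >= minSize ? female / total : null);
  }
  return shares;
}

// Countries without enough respondents stay white; the rest use the scale.
function fillFor(share, colorScale) {
  return share === null ? "#ffffff" : colorScale(share);
}

// Usage with made-up counts.
const shares = femaleShareByCountry([
  { country: "A", total: 200, female: 50 },
  { country: "B", total: 3, female: 3 },
]);
console.log(shares.get("A")); // 0.25
console.log(shares.get("B")); // null, so the country stays white
```

`fillFor` accepts any function that maps a share to a color, so it can wrap whichever color scale the map already uses.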

Groups.

After setting the minimum size, your previous groups won't give us a nice picture, so you can find a story that you want to tell and show it with a map.
Here are some hints that I find interesting:

  • China (25%) & Philippines (26%) have more female coding learners than Canada (22%) & Australia (19%);
  • Russia (16%) & Ukraine (19%) have more female respondents than UK (15%), Germany (13%) and France (11%);
  • Turkey (8.8%) has fewer female coders than Nigeria (12%), Egypt (11%) & South Africa (11%);
  • Mexico (6.5%) has fewer female coding learners than most Muslim countries like Egypt (11%), Indonesia (11%) & Turkey (8.8%);
  • Most countries of Central and South America have a very low share of female respondents (10% or less);
  • (Note) if you set the minimum size to 20, then be careful with South Korea (38%).

So my suggestions would be something like this:

25+% - USA, China, Philippines, South Korea (if min.size = 20);
20-24% - Canada;
15-19% - Ukraine, Russia, UK, Portugal, Australia, Malaysia...;
10-14% - Germany, France, Finland, Sweden, Spain, Indonesia, Nigeria, Egypt, South Africa, Brazil...;
0-9% - Mexico, Colombia, Venezuela, Vietnam, etc.
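
The suggested bands can be sketched as a simple lookup. Rounding to a whole percent before the lookup is an assumption added here so that values such as 19.6% do not fall into the gap between "15-19%" and "20-24%"; the names `BANDS` and `bandOf` are made up for this example.

```javascript
// Lower bounds (in percent) of the suggested bands, highest first.
const BANDS = [
  [25, "25+%"],
  [20, "20-24%"],
  [15, "15-19%"],
  [10, "10-14%"],
  [0, "0-9%"],
];

// Map a female share given in percent to its band label.
// Assumption: the share is rounded to a whole percent first.
function bandOf(percent) {
  const p = Math.round(percent);
  for (const [lower, label] of BANDS) {
    if (p >= lower) return label;
  }
  return null; // negative or invalid input
}

// Shares quoted in the hints above.
console.log(bandOf(26));  // Philippines -> "25+%"
console.log(bandOf(22));  // Canada      -> "20-24%"
console.log(bandOf(19));  // Ukraine     -> "15-19%"
console.log(bandOf(13));  // Germany     -> "10-14%"
console.log(bandOf(8.8)); // Turkey      -> "0-9%"
```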


Ok, I think now you have some new ideas for your gender map and also for age & ethnic minority maps!
Feel free to choose any path you like and good luck! :)
