Skip to content

Case study based on dataset created by Ian K. Analysis using Excel pivot tables and pivot charts.

Notifications You must be signed in to change notification settings

Ags-S/Ian-K-Aliens-in-America-study

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Aliens in America case study

Agnieszka Karolina Szewczyk



SCENARIO

You play a role as a newly hired Data Analyst for a pharmaceutical company.
It's the year 2022 and aliens are well known to be living amongst us. Unfortunately, some of the aliens are a bit... too alien... and would like to fit into society a bit more.
So, it's up to you to find the best state(s) we should market our new prescription.
It would be helpful to know If these aliens are hostile, Their diet, Their age. It’s up to you to clean up the data and report back.


ASK

1. Business task:

To analyse the data about Aliens living in America and find a way to market ne prescription for hostile Aliens wanting to better assimilate.

2. Questions for the analysis:

  • Is there any correlation between key characteristics and the alien hostality?
  • What is the distribution of hostile and non-hostile aliens in the whole population?

PREPARE

  • The data was prepared by Ian Klosowicz, was published on Kaggle and contains three csv files:
    aliens.csv, contains the name, gender, age, type, contact information of the aliens.
    details.csv, contains the favourite food, feeding frequency, and aggressivity of the aliens.
    location.csv, contains the city, state, country, and job of the aliens.

  • ROCCC analysis

    • Reliability : HIGH – dataset has a moderately sized alien population and comes from reliable source.
    • Originality : HIGH
    • Comprehensive : HIGH – There are only 2 files, the data is not repeating and the dataset not exceeding the limit of cells and columns taken by Excel
    • Current : HIGH – Data is up to date. Cited: HIGH – data collector and source is properly documented..
    • Cited : HIGH – data collector and source is properly documented.

    PROCESS

    Data selection & Cleaning


    Firstly, I have merged all 3 files into 1 spreadsheet for easier access

    • I have created new tab with combined information that I will need by using VLOOKUP function and copied it onto new workbook.
    • Data was checked for the duplicates and empty values.
    • I have created multiple pivot tables with information about hostile aliens living in America for easier interpretation of the data..
    • The information obtained by performing the 2 last steps was transferred to separate spreadsheet.

    49% of aliens in USA are hostile.
    Firstly, I had a look at population of aliens categorised by their type and the hostility distribution to see if there is correlation
    The distribution between hostile and non-hostile aliens of each type is pretty much even. There is also no access to how the population was changing across the years to forecast how these values may grow in the future.




    By looking at the gender distribution I can see the leading genders among aliens are by far Female and male, but the spread on hostile and non-hostile in all groups is nearly even.



    I had a look at distribution grouped by age range and once again there was no leading group to state correlation between age and hostility. The most aggressive aliens are amongst 300-350 years range with 8.45 % out of 16.80 % of population belonging to this age range.



    I had no luck in finding obvious correlation in preference of feeding frequency and hostality in comparison to non-aggressive aliens. Although most of aggressive individuals tend to feed once or never while the not aggressive ones lean towards once or daily feeding frequency.



    SHARE


    After collecting and summarising the statistics for aggressive aliens I have created dashboard using pivot charts.



    ACT


    • After concluding analysis of the alien data with no visible relationship between hostility and any of the characteristics provided, I would recommend Starting marketing campaigns in the top 3 most populated states: Texas, California and Florida.
    • It would be also recommended to consider the launching the campaigns in the most populated cities by aggressive aliens outside of the above states: Washington (33.70%) and New York City (16.67%)
      This could give feedback on improvement before introducing the nationwide marketing.
    • The recommended strategy would be to direct the campaigns towards the Males and females between 150-300 years old. This sit the highest group and would generate the most income due to making over a half oof the population of aliens.
      As my personal opinion it would be worth considering going opposite route and focus on inclusivity towards all the genders and age ranges in marketing campaigns, even though the percent of the rest of the groups is quite small - this will help accessing a much bigger group of potential customers as well as building customer trust and authenticity. With people around the world being more aware of discrimination and alienation of minority groups and more actively fighting for equal treatment the lack of diversity can make or break the brand. In recent research performed by YouGov over 60% of consumers taking part in the poll declared that they would more likely make a purchase from a brand that includes diversity in the advertisement.

About

Case study based on dataset created by Ian K. Analysis using Excel pivot tables and pivot charts.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published