Pyligent/python-challenge

python-challenge

Python Programming / Pandas / Matplotlib

PyBank

  • In this challenge, you are tasked with creating a Python script for analyzing the financial records of your company. You will be given a set of financial data called budget_data.csv. The dataset is composed of two columns: Date and Profit/Losses. (Thankfully, your company has rather lax standards for accounting, so the records are simple.)

  • Your task is to create a Python script that analyzes the records to calculate each of the following:

    • The total number of months included in the dataset

    • The total net amount of "Profit/Losses" over the entire period

    • The average change in "Profit/Losses" between months over the entire period

    • The greatest increase in profits (date and amount) over the entire period

    • The greatest decrease in profits (date and amount) over the entire period

  • As an example, your analysis should look similar to the one below:

    Financial Analysis
    ----------------------------
    Total Months: 86
    Total: $38382578
    Average Change: $-2315.12
    Greatest Increase in Profits: Feb-2012 ($1926159)
    Greatest Decrease in Profits: Sep-2013 ($-2196167)
    
  • In addition, your final script should both print the analysis to the terminal and export a text file with the results.
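The calculations listed above can be sketched as follows. This is a minimal illustration, not the repository's actual script: the sample rows are made up, the function names `analyze_budget` and `format_report` are hypothetical, and it assumes the data has at least two rows and that each change is attributed to the later of the two months.

```python
import csv
import io

# Hypothetical sample standing in for budget_data.csv (values are made up).
SAMPLE = """Date,Profit/Losses
Jan-2010,100
Feb-2010,250
Mar-2010,-50
Apr-2010,300
"""


def analyze_budget(csv_file):
    """Return summary metrics for rows with Date and Profit/Losses columns."""
    rows = [(r["Date"], int(r["Profit/Losses"])) for r in csv.DictReader(csv_file)]
    # Month-over-month changes, each labeled with the later month's date.
    changes = [(rows[i][0], rows[i][1] - rows[i - 1][1]) for i in range(1, len(rows))]
    return {
        "total_months": len(rows),
        "total": sum(amount for _, amount in rows),
        # Assumes at least two rows, otherwise there are no changes to average.
        "average_change": sum(c for _, c in changes) / len(changes),
        "greatest_increase": max(changes, key=lambda c: c[1]),
        "greatest_decrease": min(changes, key=lambda c: c[1]),
    }


def format_report(result):
    """Build the report text in the layout shown in the example above."""
    inc_date, inc_amount = result["greatest_increase"]
    dec_date, dec_amount = result["greatest_decrease"]
    return "\n".join([
        "Financial Analysis",
        "----------------------------",
        f"Total Months: {result['total_months']}",
        f"Total: ${result['total']}",
        f"Average Change: ${result['average_change']:.2f}",
        f"Greatest Increase in Profits: {inc_date} (${inc_amount})",
        f"Greatest Decrease in Profits: {dec_date} (${dec_amount})",
    ])


result = analyze_budget(io.StringIO(SAMPLE))
report = format_report(result)
print(report)
```

To meet the export requirement, the same `report` string could be written to a text file with `open(path, "w")`, and `io.StringIO(SAMPLE)` could be replaced by the opened budget_data.csv file.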

PyPoll

  • In this challenge, you are tasked with helping a small, rural town modernize its vote-counting process. (Up until now, Uncle Cleetus had been trustfully tallying them one-by-one, but unfortunately, his concentration isn't what it used to be.)

  • You will be given a set of poll data called election_data.csv. The dataset is composed of three columns: Voter ID, County, and Candidate. Your task is to create a Python script that analyzes the votes and calculates each of the following:

    • The total number of votes cast

    • A complete list of candidates who received votes

    • The percentage of votes each candidate won

    • The total number of votes each candidate won

    • The winner of the election based on popular vote.

  • As an example, your analysis should look similar to the one below:

    Election Results
    -------------------------
    Total Votes: 3521001
    -------------------------
    Khan: 63.000% (2218231)
    Correy: 20.000% (704200)
    Li: 14.000% (492940)
    O'Tooley: 3.000% (105630)
    -------------------------
    Winner: Khan
    -------------------------
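One possible way to produce the tallies above is with `collections.Counter`. The ballots below are made-up stand-ins for election_data.csv, the `tally` function is a hypothetical helper, and ties for first place are not handled.

```python
from collections import Counter

# Hypothetical ballots as (voter_id, county, candidate); values are made up.
BALLOTS = [
    (1, "A", "Khan"), (2, "A", "Khan"), (3, "B", "Correy"), (4, "B", "Khan"),
    (5, "A", "Li"), (6, "B", "Correy"), (7, "A", "Khan"), (8, "B", "Li"),
    (9, "A", "Correy"), (10, "B", "Khan"),
]


def tally(ballots):
    """Return (total votes, per-candidate results, winner by popular vote)."""
    counts = Counter(candidate for _, _, candidate in ballots)
    total = sum(counts.values())
    # most_common() sorts candidates from most to fewest votes.
    results = [(name, 100 * votes / total, votes) for name, votes in counts.most_common()]
    winner = results[0][0]  # Assumes no tie for first place.
    return total, results, winner


total, results, winner = tally(BALLOTS)
print(f"Total Votes: {total}")
for name, percent, votes in results:
    print(f"{name}: {percent:.3f}% ({votes})")
print(f"Winner: {winner}")
```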
    

PyBoss

In this challenge, you get to be the boss. You oversee hundreds of employees across the country developing Tuna 2.0, a world-changing snack food based on canned tuna fish. Alas, being the boss isn't all fun, games, and self-adulation. The company recently decided to purchase a new HR system, and unfortunately for you, the new system requires employee records be stored completely differently.

Your task is to help bridge the gap by creating a Python script able to convert your employee records to the required format. Your script will need to do the following:

  • Import the employee_data1.csv and employee_data2.csv files, which currently hold employee records like the ones below:

    Emp ID,Name,DOB,SSN,State
    214,Sarah Simpson,1985-12-04,282-01-8166,Florida
    15,Samantha Lara,1993-09-08,848-80-7526,Colorado
    411,Stacy Charles,1957-12-20,658-75-8526,Pennsylvania

  • Then convert and export the data to use the following format instead:

    Emp ID,First Name,Last Name,DOB,SSN,State
    214,Sarah,Simpson,12/04/1985,***-**-8166,FL
    15,Samantha,Lara,09/08/1993,***-**-7526,CO
    411,Stacy,Charles,12/20/1957,***-**-8526,PA

  • In summary, the required conversions are as follows:

    • The Name column should be split into separate First Name and Last Name columns.

    • The DOB data should be re-written into MM/DD/YYYY format.

    • The SSN data should be re-written such that the first five digits are hidden from view.

    • The State data should be re-written as simple two-letter abbreviations.
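The four conversions can be sketched with a single per-record function. This is an illustration under stated assumptions: `convert_record` is a hypothetical name, the state lookup covers only the three states in the sample rows (a full script would need every state), and names are assumed to contain exactly one first and one last name.

```python
from datetime import datetime

# Partial lookup for illustration only; a real script needs all states.
STATE_ABBREV = {"Florida": "FL", "Colorado": "CO", "Pennsylvania": "PA"}


def convert_record(record):
    """Convert [Emp ID, Name, DOB, SSN, State] to the new six-column layout."""
    emp_id, name, dob, ssn, state = record
    # Split on the first space; assumes a simple "First Last" name.
    first, last = name.split(" ", 1)
    # Re-write YYYY-MM-DD as MM/DD/YYYY.
    dob_out = datetime.strptime(dob, "%Y-%m-%d").strftime("%m/%d/%Y")
    # Keep only the last four digits of the SSN visible.
    ssn_out = "***-**-" + ssn[-4:]
    return [emp_id, first, last, dob_out, ssn_out, STATE_ABBREV[state]]


rows = [
    ["214", "Sarah Simpson", "1985-12-04", "282-01-8166", "Florida"],
    ["15", "Samantha Lara", "1993-09-08", "848-80-7526", "Colorado"],
    ["411", "Stacy Charles", "1957-12-20", "658-75-8526", "Pennsylvania"],
]
converted = [convert_record(r) for r in rows]
for row in converted:
    print(",".join(row))
```

In a full script, the rows would come from `csv.reader` over the two input files and the converted rows would be written back out with `csv.writer`.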

PyParagraph

In this challenge, you get to play the role of chief linguist at a local learning academy. As chief linguist, you are responsible for assessing the complexity of various passages of writing, ranging from the sophomoric Twilight novel to the nauseatingly high-minded research article. Having read so many passages, you've since come up with a fairly simple set of metrics for assessing complexity.

Your task is to create a Python script to automate the analysis of any such passage using these metrics. Your script will need to do the following:

  • Import a text file filled with a paragraph of your choosing.

  • Assess the passage for each of the following:

    • Approximate word count

    • Approximate sentence count

    • Average letter count (per word)

    • Average sentence length (in words)

  • As an example, this passage:

“Adam Wayne, the conqueror, with his face flung back and his mane like a lion's, stood with his great sword point upwards, the red raiment of his office flapping around him like the red wings of an archangel. And the King saw, he knew not how, something new and overwhelming. The great green trees and the great red robes swung together in the wind. The preposterous masquerade, born of his own mockery, towered over him and embraced the world. This was the normal, this was sanity, this was nature, and he himself, with his rationality, and his detachment and his black frock-coat, he was the exception and the accident a blot of black upon a world of crimson and gold.”

...would yield these results:

    Paragraph Analysis
    -----------------
    Approximate Word Count: 122
    Approximate Sentence Count: 5
    Average Letter Count: 4.6
    Average Sentence Length: 24.0
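A rough sketch of these metrics is shown below. The counting rules are assumptions, since the challenge only asks for approximations: words are whitespace-separated tokens, sentences end at `.`, `!`, or `?` followed by whitespace, and only alphabetic characters count as letters. The short sample text and the `analyze_paragraph` name are made up for illustration, so the numbers will not necessarily match the example results above.

```python
import re


def analyze_paragraph(text):
    """Return approximate complexity metrics for a passage of text."""
    words = text.split()
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    letters = sum(ch.isalpha() for ch in text)
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "average_letter_count": letters / len(words),
        "average_sentence_length": len(words) / len(sentences),
    }


# Hypothetical short passage used only to exercise the function.
metrics = analyze_paragraph("The cat sat. The dog ran away! Did it rain?")
print("Paragraph Analysis")
print("-----------------")
print(f"Approximate Word Count: {metrics['word_count']}")
print(f"Approximate Sentence Count: {metrics['sentence_count']}")
print(f"Average Letter Count: {metrics['average_letter_count']:.1f}")
print(f"Average Sentence Length: {metrics['average_sentence_length']:.1f}")
```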

Pandas Challenge

Hero Of Pymoli

Like many others in its genre, the game is free-to-play, but players are encouraged to purchase optional items that enhance their playing experience. As a first task, the company would like you to generate a report that breaks down the game's purchasing data into meaningful insights.

Your final report should include each of the following:

Player Count

  • Total Number of Players

Purchasing Analysis (Total)

  • Number of Unique Items
  • Average Purchase Price
  • Total Number of Purchases
  • Total Revenue

Gender Demographics

  • Percentage and Count of Male Players
  • Percentage and Count of Female Players
  • Percentage and Count of Other / Non-Disclosed
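One detail worth illustrating for the gender demographics: a player who made several purchases should be counted once. The sketch below assumes hypothetical column names (`SN` for the player's screen name, `Gender`) and made-up rows; the real dataset's columns may differ.

```python
import pandas as pd

# Hypothetical purchase rows; player "a" appears twice.
purchases = pd.DataFrame({
    "SN": ["a", "a", "b", "c", "d"],
    "Gender": ["Male", "Male", "Female", "Male", "Other / Non-Disclosed"],
})

# Keep one row per player so repeat buyers are not double-counted.
players = purchases.drop_duplicates(subset="SN")

counts = players["Gender"].value_counts()
percentages = players["Gender"].value_counts(normalize=True) * 100

gender_demographics = pd.DataFrame({
    "Total Count": counts,
    "Percentage of Players": percentages,
})
print(gender_demographics)
```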

Purchasing Analysis (Gender)

  • Each of the following, broken down by gender
    • Purchase Count
    • Average Purchase Price
    • Total Purchase Value
    • Average Purchase Total per Person by Gender

Age Demographics

  • Each of the following, broken into five-year age bins (e.g. <10, 10-14, 15-19, etc.)
    • Purchase Count
    • Average Purchase Price
    • Total Purchase Value
    • Average Purchase Total per Person by Age Group
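The age breakdown can be sketched with `pd.cut` followed by a group-by. The column names (`SN`, `Age`, `Price`), the bin edges, and the sample rows are all assumptions for illustration; only the first few bins are shown, with a catch-all `20+` bin standing in for the rest.

```python
import pandas as pd

# Hypothetical purchase rows; player "a" made two purchases.
purchases = pd.DataFrame({
    "SN": ["a", "a", "b", "c", "d"],
    "Age": [8, 8, 12, 14, 17],
    "Price": [1.00, 3.00, 2.00, 4.00, 5.00],
})

# With the default right=True, the bins are (0, 9], (9, 14], (14, 19], (19, 200].
edges = [0, 9, 14, 19, 200]
labels = ["<10", "10-14", "15-19", "20+"]
purchases["Age Group"] = pd.cut(purchases["Age"], bins=edges, labels=labels)

# observed=True drops bins that contain no purchases.
grouped = purchases.groupby("Age Group", observed=True)
summary = grouped.agg(
    purchase_count=("Price", "count"),
    average_price=("Price", "mean"),
    total_value=("Price", "sum"),
)
# Divide by unique players, not purchases, for the per-person figure.
summary["avg_total_per_person"] = summary["total_value"] / grouped["SN"].nunique()
print(summary)
```

The same pattern, grouping by `Gender` instead of `Age Group`, would cover the per-gender purchasing analysis.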

Top Spenders

  • Identify the top 5 spenders in the game by total purchase value, then list (in a table):
    • SN
    • Purchase Count
    • Average Purchase Price
    • Total Purchase Value

Most Popular Items

  • Identify the 5 most popular items by purchase count, then list (in a table):
    • Item ID
    • Item Name
    • Purchase Count
    • Item Price
    • Total Purchase Value

Most Profitable Items

  • Identify the 5 most profitable items by total purchase value, then list (in a table):
    • Item ID
    • Item Name
    • Purchase Count
    • Item Price
    • Total Purchase Value

Matplotlib Challenge

Pyber

The ride sharing bonanza continues! Seeing the success of notable players like Uber and Lyft, you've decided to join a fledgling ride sharing company of your own. In your latest capacity, you'll be acting as Chief Data Strategist for the company. In this role, you'll be expected to offer data-backed guidance on new opportunities for market differentiation.

Build a Bubble Plot that showcases the relationship between four key variables:

  • Average Fare ($) Per City
  • Total Number of Rides Per City
  • Total Number of Drivers Per City
  • City Type (Urban, Suburban, Rural)

In addition, you will be expected to produce the following three pie charts:

  • % of Total Fares by City Type
  • % of Total Rides by City Type
  • % of Total Drivers by City Type
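Before plotting, the pie-chart percentages have to be aggregated by city type. The sketch below shows only that aggregation step in plain Python, with made-up ride records and a hypothetical `share_by_type` helper; the resulting values and labels could then be handed to a plotting call such as `matplotlib.pyplot.pie`.

```python
from collections import defaultdict

# Hypothetical ride records as (city_type, fare); values are made up.
RIDES = [
    ("Urban", 20.0),
    ("Urban", 30.0),
    ("Suburban", 30.0),
    ("Rural", 20.0),
]


def share_by_type(pairs):
    """Sum values per city type and return each type's percentage of the total."""
    totals = defaultdict(float)
    for city_type, value in pairs:
        totals[city_type] += value
    grand_total = sum(totals.values())
    return {city_type: 100 * value / grand_total for city_type, value in totals.items()}


# % of total fares: weight each ride by its fare.
fare_share = share_by_type(RIDES)
# % of total rides: weight each ride equally.
ride_share = share_by_type((city_type, 1) for city_type, _ in RIDES)

print(fare_share)
print(ride_share)
```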
