![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)

# Data Science Hackathon

As our world becomes increasingly data-driven, the ability to analyze, visualize, and draw insights from large and complex datasets is becoming an essential skill. Data science can provide you with the tools to make better decisions and solve complex problems.

Click on the cell below, then click the Run button above to import and display a dataframe about *hypothetical* pets for adoption from our friends at [Bootstrap](https://www.bootstrapworld.org/materials/data-science).

```
import pandas as pd
pets = pd.read_csv('https://raw.githubusercontent.com/callysto/data-files/main/data-science-and-artificial-intelligence/pets.csv')
pets
```

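| Unnamed: 0 | Name | Species | Gender | Age (years) | Fixed | Legs | Weight (lbs) | Time to Adoption (weeks) |
|---|---|---|---|---|---|---|---|---|
| 0 | Sasha | cat | female | 1 | False | 4 | 6.5 | 3 |
| 1 | Mittens | cat | female | 2 | True | 4 | 7.4 | 1 |
| 2 | Sunflower | cat | female | 5 | True | 4 | 8.1 | 6 |
| 3 | Sheba | cat | female | 7 | True | 4 | 8.4 | 6 |
| 4 | Felix | cat | male | 16 | True | 4 | 9.2 | 5 |
| 5 | Snowcone | cat | female | 2 | True | 4 | 6.5 | 5 |
| 6 | Wade | cat | male | 1 | False | 4 | 3.2 | 1 |
| 7 | Hercules | cat | male | 3 | False | 4 | 13.4 | 2 |
| 8 | Toggle | dog | female | 3 | True | 4 | 48.0 | 1 |
| 9 | Boo-boo | dog | male | 11 | True | 4 | 123.0 | 24 |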


To create visualizations, we will use [Plotly Express](https://plotly.com/python/plotly-express). Click on the cell below, then click Run to display a `bar` graph.

```
import plotly.express as px
px.bar(pets, x='Name', y='Age (years)', title='Pets Ages')
```

## Beginner Challenges

Each of these challenges is worth 2 points, and uses the `pets` dataframe.

### Beginner Visualizations

1. make a bar graph with `Name` on the x-axis and `Legs` on the y-axis
2. make a bar graph using
```
x='Species', y='Age (years)', color='Gender'
```
3. recreate the previous bar graph, but add
```
, barmode='group'
```
4. make a line graph by changing `bar` to `line` from the example
```
px.bar(pets, x='Name', y='Age (years)', title='Pets Ages')
```
5. make a scatter plot with
```
px.scatter(pets, x='Name', y='Age (years)')
```
6. recreate the previous scatter plot, but add a title
7. recreate the previous scatter plot and color the points by `Gender`
8. make a scatter plot comparing `Age (years)` with `Weight (lbs)`
9. recreate the previous scatter plot, but add
```
, size='Time to Adoption (weeks)'
```

### Beginner Data Analysis

1. sort the `pets` dataframe by `Age (years)`
```
pets.sort_values('Age (years)')
```
2. sort by age in descending order by adding
```
, ascending=False
```
3. create a bar graph of the sorted dataframe
4. display just one column of the dataframe
```
pets['Species']
```
5. display a few columns of the dataframe
```
pets[['Name', 'Legs', 'Time to Adoption (weeks)']]
```
6. display just the dogs
```
pets[pets['Species']=='dog']
```
7. display all the animals that are not dogs
```
pets[pets['Species']!='dog']
```
8. filter the `pets` dataframe to show just cats and dogs, using
```
pets[pets['Species'].isin(['cat', 'dog'])]
```
9. count the pets of each species, then create a pie chart of the counts using the following code:
```
species_counts = pets.groupby('Species').size()
px.pie(values=species_counts, names=species_counts.index)
```
10. recreate the previous pie chart, and add a title
11. find the average (mean) of a column with
```
pets['Age (years)'].mean()
```
12. find the median value of a different column
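
Filtering, sorting, and selecting columns each return a new dataframe, so the steps above can be chained by storing each result in a variable. The following is a rough sketch under the assumption of a five-row dataframe named `sample` (a hypothetical name) copied from the pets table. The variable names `cats` and `oldest_first` are also just illustrative.

```python
import pandas as pd

# Five rows copied from the pets table, so the sketch runs on its own.
sample = pd.DataFrame({
    'Name': ['Sasha', 'Mittens', 'Felix', 'Toggle', 'Boo-boo'],
    'Species': ['cat', 'cat', 'cat', 'dog', 'dog'],
    'Age (years)': [1, 2, 16, 3, 11],
    'Weight (lbs)': [6.5, 7.4, 9.2, 48.0, 123.0],
})

# Keep only the cats, then sort them from oldest to youngest.
cats = sample[sample['Species'] == 'cat']
oldest_first = cats.sort_values('Age (years)', ascending=False)

# Display just two columns of the sorted result.
print(oldest_first[['Name', 'Age (years)']])

# Median weight of the cats in this sample.
median_weight = cats['Weight (lbs)'].median()
print(median_weight)
```

None of these steps change `sample` itself, which is why the filtered and sorted results are saved under new names.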


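- create a new column: `pets['Mass (kg)'] = pets['Weight (lbs)'] / 2.205`
- create a new column: `Age (days)`
- `pets.groupby('Fixed').mean(numeric_only=True)` (recent versions of pandas raise an error without `numeric_only=True`, because text columns such as `Name` cannot be averaged)
- `pets.groupby('Species').size()`
- `sum()`
- `min()` and `max()`
- histogram: count (default)
- histogram: avg
- sunburst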
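
Several of the ideas noted above can be tried out on a small table first. The following is a minimal sketch, assuming a five-row dataframe named `sample` (a hypothetical name) copied from the pets table. The conversion to days assumes 365 days per year and ignores leap years.

```python
import pandas as pd

# Five rows copied from the pets table, so the sketch runs on its own.
sample = pd.DataFrame({
    'Name': ['Sasha', 'Mittens', 'Felix', 'Toggle', 'Boo-boo'],
    'Species': ['cat', 'cat', 'cat', 'dog', 'dog'],
    'Fixed': [False, True, True, True, True],
    'Age (years)': [1, 2, 16, 3, 11],
    'Weight (lbs)': [6.5, 7.4, 9.2, 48.0, 123.0],
})

# New columns calculated from existing columns.
sample['Mass (kg)'] = sample['Weight (lbs)'] / 2.205
sample['Age (days)'] = sample['Age (years)'] * 365  # assumption: no leap years

# Number of pets in each species.
species_counts = sample.groupby('Species').size()

# Average of each numeric column, split by whether the pet is fixed.
# numeric_only=True skips text columns such as Name and Species.
fixed_means = sample.groupby('Fixed').mean(numeric_only=True)

# Single-number summaries of one column.
youngest = sample['Age (years)'].min()
oldest = sample['Age (years)'].max()
total_weight = sample['Weight (lbs)'].sum()

print(species_counts)
print(fixed_means)
print(youngest, oldest, total_weight)
```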

Other Data:

Pokemon, Open Data (e.g. zoo), APIs (Pokeapi, nba_api, Spotify)




[![Callysto.ca License](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-bottom.jpg?raw=true)](https://github.com/callysto/curriculum-notebooks/blob/master/LICENSE.md)