![star_wars_unsplash](star_wars_unsplash.jpg)

Lego is a household name across the world, supported by a diverse toy line, hit movies, and a series of successful video games. In this project, we are going to explore a key development in the history of Lego: the introduction of licensed sets such as Star Wars, Super Heroes, and Harry Potter.

The introduction of its first licensed series, Star Wars, was a hit that sparked a series of collaborations on further themed sets. The partnerships team has asked you to analyze this success. Before you dive in, they suggest reading the descriptions of the two datasets, given below.

## The Data

You have been provided with two datasets to use. A summary and preview are provided below.

## lego_sets.csv

| Column     | Description              |
|------------|--------------------------|
| `"set_num"` | A code that is unique to each set in the dataset. This column is critical, and a missing value indicates the set is a duplicate or invalid! |
| `"name"` | The name of the set. |
| `"year"` | The year the set was released. |
| `"num_parts"` | The number of parts contained in the set. This column is not central to our analyses, so missing values are acceptable. |
| `"theme_name"` | The name of the sub-theme of the set. |
| `"parent_theme"` | The name of the parent theme the set belongs to. Matches the `"name"` column of `parent_themes.csv`. |

## parent_themes.csv

| Column     | Description              |
|------------|--------------------------|
| `"id"` | A code that is unique to every theme. |
| `"name"` | The name of the parent theme. |
| `"is_licensed"` | A Boolean column specifying whether the theme is a licensed theme. |

In [2]:
# Import pandas, read and inspect the datasets
import pandas as pd

lego_sets = pd.read_csv('data/lego_sets.csv')
lego_sets.head()

Unnamed: 0,set_num,name,year,num_parts,theme_name,parent_theme
0,00-1,Weetabix Castle,1970,471.0,Castle,Legoland
1,0011-2,Town Mini-Figures,1978,,Supplemental,Town
2,0011-3,Castle 2 for 1 Bonus Offer,1987,,Lion Knights,Castle
3,0012-1,Space Mini-Figures,1979,12.0,Supplemental,Space
4,0013-1,Space Mini-Figures,1979,12.0,Supplemental,Space


In [3]:
parent_themes = pd.read_csv('data/parent_themes.csv')
parent_themes.head()

Unnamed: 0,id,name,is_licensed
0,1,Technic,False
1,22,Creator,False
2,50,Town,False
3,112,Racers,False
4,126,Space,False
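Because `"parent_theme"` in `lego_sets.csv` matches `"name"` in `parent_themes.csv`, the two tables can be joined to attach `"is_licensed"` to every set. The following is a minimal sketch of one way to do that. The frames `lego_sets_demo` and `parent_themes_demo` and their rows are made-up stand-ins, not the real data, and the renamed column `theme_id` is a hypothetical name chosen for illustration.

```python
import pandas as pd

# Toy stand-ins for the two CSV files (illustrative values, not the real data)
lego_sets_demo = pd.DataFrame({
    "set_num": ["a-1", "b-1", "c-1"],
    "name": ["Demo X-Wing", "Demo Fire Truck", "Demo Castle"],
    "parent_theme": ["Star Wars", "Town", "Harry Potter"],
})
parent_themes_demo = pd.DataFrame({
    "id": [158, 50, 246],
    "name": ["Star Wars", "Town", "Harry Potter"],
    "is_licensed": [True, False, True],
})

# Rename the theme columns so the join key has the same name in both frames
# and the theme "name" does not collide with the set "name"
themes_for_join = parent_themes_demo.rename(
    columns={"name": "parent_theme", "id": "theme_id"}
)

# Left join keeps every set, even if its theme had no match
merged_demo = lego_sets_demo.merge(themes_for_join, on="parent_theme", how="left")
print(merged_demo)
```

A left join is used here so that no set is silently dropped; sets with an unmatched theme would show a missing `"is_licensed"` value instead.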


In [4]:
# Start coding here
# Use as many cells as you need

In [8]:
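# dropna() with no arguments drops every row that has a missing value in any column,
# including "num_parts"; the result is only displayed here, not assigned back
lego_sets.dropna()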

Unnamed: 0,set_num,name,year,num_parts,theme_name,parent_theme
0,00-1,Weetabix Castle,1970,471.0,Castle,Legoland
3,0012-1,Space Mini-Figures,1979,12.0,Supplemental,Space
4,0013-1,Space Mini-Figures,1979,12.0,Supplemental,Space
5,0014-1,Space Mini-Figures,1979,12.0,Supplemental,Space
10,00-4,Weetabix Promotional Windmill,1976,126.0,Building,Legoland
...,...,...,...,...,...,...
11823,VPorient-1,Orient Expedition Value Pack with LEGO Backpac...,2003,4.0,Orient Expedition,Adventurers
11824,vwkit-1,Volkswagen Kit,1959,22.0,Basic Set,Classic
11826,W991526-1,Homeschool Introduction to Simple and Motorize...,2009,0.0,Technic,Educational and Dacta
11828,Wauwatosa-1,"LEGO Store Grand Opening Exclusive Set, Mayfai...",2012,15.0,LEGO Brand Store,LEGO Brand Store
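The data description says a missing `"set_num"` marks a set as invalid, while a missing `"num_parts"` is acceptable. A narrower clean-up that follows those rules could look like the sketch below. It uses a small made-up frame (`sets_demo`) rather than the real file, and `cleaned_demo` is a hypothetical name.

```python
import numpy as np
import pandas as pd

# Toy data: one row lacks set_num (invalid), one row lacks num_parts (acceptable)
sets_demo = pd.DataFrame({
    "set_num": ["x-1", None, "y-1"],
    "name": ["Demo Set A", "Demo Set B", "Demo Set C"],
    "num_parts": [471.0, 10.0, np.nan],
})

# Drop only the rows where set_num is missing, and keep the result
cleaned_demo = sets_demo.dropna(subset=["set_num"])
print(cleaned_demo)
```

With `subset=["set_num"]`, the row with a missing part count survives, which a bare `dropna()` would have removed.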


In [9]:
# Display parent_themes without rows that contain missing values (result is not assigned)
parent_themes.dropna()

Unnamed: 0,id,name,is_licensed
0,1,Technic,False
1,22,Creator,False
2,50,Town,False
3,112,Racers,False
4,126,Space,False
...,...,...,...
106,605,Nexo Knights,False
107,606,Angry Birds,True
108,607,Ghostbusters,True
109,608,Disney,True


In [10]:
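# Keep only the licensed themes. value_counts() on a DataFrame counts unique rows,
# so each theme appears once with a count of 1.
parent_themes_licensed = parent_themes[parent_themes["is_licensed"] == True].value_counts()
parent_themes_licensed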

id   name                              is_licensed
158  Star Wars                         True           1
246  Harry Potter                      True           1
607  Ghostbusters                      True           1
606  Angry Birds                       True           1
603  Scooby-Doo                        True           1
602  Jurassic World                    True           1
579  Disney Princess                   True           1
577  Minecraft                         True           1
575  The Lone Ranger                   True           1
570  Teenage Mutant Ninja Turtles      True           1
561  The Hobbit and Lord of the Rings  True           1
482  Super Heroes                      True           1
388  Disney's Mickey Mouse             True           1
317  Avatar                            True           1
275  Toy Story                         True           1
272  SpongeBob SquarePants             True           1
271  Prince of Persia                  True          

In [11]:
#test = np.empty([22,2])
#test[1,:] = parent_themes_licensed.iloc[1]

#for item in parent_themes_licensed:
    #test[item,1] = len(lego_sets["parent_theme"]==item)

#print(test)
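The commented-out loop above appears to aim at counting how many sets belong to each licensed theme. A possible vectorized sketch of that idea, assuming this reading of the intent is right, uses `isin` and `value_counts` instead of a preallocated array. The frames and the names `licensed_names`, `licensed_sets_demo`, and `sets_per_theme` are illustrative only.

```python
import pandas as pd

# Toy stand-ins (illustrative values, not the real data)
sets_demo = pd.DataFrame({
    "parent_theme": ["Star Wars", "Star Wars", "Town", "Harry Potter", "Star Wars"],
})
themes_demo = pd.DataFrame({
    "name": ["Star Wars", "Town", "Harry Potter"],
    "is_licensed": [True, False, True],
})

# Names of the licensed themes
licensed_names = themes_demo.loc[themes_demo["is_licensed"], "name"]

# Keep only sets whose parent theme is licensed, then count sets per theme
licensed_sets_demo = sets_demo[sets_demo["parent_theme"].isin(licensed_names)]
sets_per_theme = licensed_sets_demo["parent_theme"].value_counts()
print(sets_per_theme)
```

Note that `len(series == value)` returns the length of the whole Series, not the number of matches, which is why the count here comes from `value_counts()`.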

In [12]:
# Boolean mask and subset for sets whose parent theme is Star Wars
is_sw = lego_sets["parent_theme"].isin(["Star Wars"])
is_starwars = lego_sets[is_sw]

In [13]:
is_sw_num = len(is_starwars)
print(is_sw_num)

609
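A raw count is easier to interpret as a share of all licensed sets. The sketch below shows one way to compute such a percentage, assuming the sets have already been restricted to licensed themes. The frame `licensed_demo` and the name `star_wars_share` are made up for illustration and do not reflect the real figures.

```python
import pandas as pd

# Toy frame that is assumed to contain licensed sets only (illustrative values)
licensed_demo = pd.DataFrame({
    "parent_theme": ["Star Wars", "Star Wars", "Star Wars", "Harry Potter", "Super Heroes"],
})

# The mean of a boolean mask is the fraction of True values
is_sw_demo = licensed_demo["parent_theme"] == "Star Wars"
star_wars_share = is_sw_demo.mean() * 100
print(f"{star_wars_share:.1f}% of the toy licensed sets are Star Wars")
```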


In [14]:
# Confirm that Star Wars is flagged as a licensed theme in parent_themes
licensed = parent_themes["is_licensed"] == True
sw = parent_themes["name"] == "Star Wars"
sw_licensed = parent_themes[licensed & sw]
print(sw_licensed)

    id       name  is_licensed
7  158  Star Wars         True


In [15]:
# Note: this computes the most recent release year among Star Wars sets
new_era = lego_sets[lego_sets["parent_theme"] == "Star Wars"]["year"].max()
print(new_era)


2017
