# Data Collection

This Jupyter notebook collects card data through the [Scryfall API](https://scryfall.com/docs/api).  

The Scryfall API offers [bulk data](https://scryfall.com/docs/api/bulk-data) downloads covering all Magic: The Gathering cards.  
We're interested in the **Default Cards** file, as it contains all Magic: The Gathering cards including reprints.  
The download URI for the Default Cards JSON file is listed in the response to the bulk data request.

In [3]:
import requests

r = requests.get('https://api.scryfall.com/bulk-data')

if r.status_code == 200:
    response_dict = r.json()
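    # Look up the Default Cards entry by its type rather than by its position in the list,
    # so the lookup keeps working if the order of the bulk data entries changes.
    default_cards_download_uri = next(
        entry['download_uri']
        for entry in response_dict['data']
        if entry['type'] == 'default_cards'
    )
    # Download the cards and save them to a JSON file.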
    cards_r = requests.get(default_cards_download_uri)
    if cards_r.status_code == 200:
        with open('bulk_data_default_cards.json', 'wb') as f:
            f.write(cards_r.content)
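
The Default Cards file is large, so holding the whole response body in memory before writing it may not always be desirable. As a minimal sketch of an alternative, the helper below writes a streamed response to disk chunk by chunk. The function name `save_streamed_response` and the default chunk size are assumptions made for illustration; the only thing it expects from the response object is the `iter_content` method that `requests` provides on responses.

```python
def save_streamed_response(response, path, chunk_size=1024 * 1024):
    """Write a streamed HTTP response body to `path`, one chunk at a time.

    `response` is expected to behave like a `requests.Response` obtained
    with `stream=True`, i.e. to expose `iter_content(chunk_size=...)`.
    Returns the number of bytes written.
    """
    written = 0
    with open(path, 'wb') as f:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty chunks
                f.write(chunk)
                written += len(chunk)
    return written
```

With `requests`, this could be called as `save_streamed_response(requests.get(default_cards_download_uri, stream=True), 'bulk_data_default_cards.json')`.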

# Data Preprocessing

We're going to perform the following preprocessing on the default cards data:
- Keep cards in English only
- Keep cards in main sets only
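
The two filters above can be sketched with boolean masks. This is only an illustration on a toy DataFrame, under two assumptions: that the language lives in the `lang` column (visible in the data) and that "main sets" corresponds to the Scryfall `set_type` values `core` and `expansion`. The helper name `filter_cards`, the constant `MAIN_SET_TYPES`, and the sample rows are made up for this example.

```python
import pandas

# Assumption: "main sets" means core sets and expansions.
MAIN_SET_TYPES = ['core', 'expansion']


def filter_cards(cards_df, lang='en', set_types=MAIN_SET_TYPES):
    """Keep only rows in the given language whose set type is in `set_types`."""
    is_lang = cards_df['lang'] == lang
    is_main_set = cards_df['set_type'].isin(set_types)
    return cards_df[is_lang & is_main_set].reset_index(drop=True)


# Toy data with placeholder values, not real Scryfall records.
sample_df = pandas.DataFrame({
    'name': ['Card A', 'Card B', 'Card C', 'Card D'],
    'lang': ['en', 'en', 'en', 'ja'],
    'set_type': ['expansion', 'core', 'token', 'core'],
})

filtered_df = filter_cards(sample_df)
print(filtered_df['name'].tolist())
```

Only rows that pass both conditions remain; the token card and the non-English card are dropped.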

In [2]:
# First, load the card data into a pandas DataFrame.
import pandas

magic_df = pandas.read_json('bulk_data_default_cards.json')

In [3]:
# Take a look at the head of the dataframe to make sure everything went smoothly.
magic_df.head()

| | object | id | oracle_id | multiverse_ids | mtgo_id | mtgo_foil_id | tcgplayer_id | cardmarket_id | name | lang | ... | card_faces | tcgplayer_etched_id | color_indicator | life_modifier | hand_modifier | printed_type_line | printed_text | content_warning | flavor_name | variation_of |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | card | 0000579f-7b35-4ed3-b44c-db2a538066fe | 44623693-51d6-49ad-8cd7-140505caf02f | [109722] | 25527.0 | 25528.0 | 14240.0 | 13850.0 | Fury Sliver | en | ... | | | | | | | | | | |
| 1 | card | 00006596-1166-4a79-8443-ca9f82e6db4e | 8ae3562f-28b7-4462-96ed-be0cf7052ccc | [189637] | 34586.0 | 34587.0 | 33347.0 | 21851.0 | Kor Outfitter | en | ... | | | | | | | | | | |
| 2 | card | 0000a54c-a511-4925-92dc-01b937f9afad | dc4e2134-f0c2-49aa-9ea3-ebf83af1445c | [] | | | 98659.0 | | Spirit | en | ... | | | | | | | | | | |
| 3 | card | 0000cd57-91fe-411f-b798-646e965eec37 | 9f0d82ae-38bf-45d8-8cda-982b6ead1d72 | [435231] | 65170.0 | 65171.0 | 145764.0 | 301766.0 | Siren Lookout | en | ... | | | | | | | | | | |
| 4 | card | 00012bd8-ed68-4978-a22d-f450c8a6e048 | 5aa12aff-db3c-4be5-822b-3afdf536b33e | [1278] | | | 1623.0 | 5664.0 | Web | en | ... | | | | | | | | | | |


As can be seen, each card has a lot of properties: 83, to be exact! We won't need all 83 of them, 
so let's get familiar with the properties and cut them down to a more manageable number.
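
As a rough sketch of what cutting down the properties could look like, the helper below restricts a DataFrame to a chosen list of columns and ignores any that are missing. The helper name `select_columns` and the column selection are hypothetical: the names are taken from the output above purely for illustration and are not the final selection, and the toy frame stands in for `magic_df`.

```python
import pandas


def select_columns(cards_df, wanted):
    """Return a copy of `cards_df` restricted to the `wanted` columns that exist."""
    present = [column for column in wanted if column in cards_df.columns]
    return cards_df[present].copy()


# Toy frame standing in for magic_df; the values are placeholders.
toy_df = pandas.DataFrame({
    'object': ['card', 'card'],
    'id': ['id-1', 'id-2'],
    'oracle_id': ['oracle-1', 'oracle-2'],
    'name': ['Card A', 'Card B'],
    'lang': ['en', 'en'],
})

# Hypothetical selection, chosen only for illustration.
trimmed_df = select_columns(toy_df, ['id', 'oracle_id', 'name', 'lang', 'set_type'])
print(toy_df.shape[1], '->', trimmed_df.shape[1])
```

On the real data, `magic_df.shape[1]` gives the column count and `magic_df.columns` lists the available property names.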