# Data Processing

The main purpose of this notebook is to convert the JSON data made available by the [wfcd/warframe-items](https://github.com/WFCD/warframe-items) GitHub repo into formats we can use for our webpage. There are two formats we want to produce: some pickled Python objects which will later be passed to a Jinja2 template to generate the webpage, and some JavaScript files which can be imported by the webpage when it is loaded.

The other main purpose of this notebook is cherry-picking the data we want. The data provided by wfcd/warframe-items includes far more detail than we need, including items' stats, build times, drop locations and chances, patch notes, and more. All we are really interested in are names, components and quantities, image names, and build costs.

#### Setup for this notebook

Download or clone the [wfcd/warframe-items](https://github.com/WFCD/warframe-items) repo. Copy the top-level `data` folder from warframe-items into this repo. This folder contains the JSON files with the item data under `data/json`, and the images for each item under `data/img`.

#### Output of this notebook

- `category_data.pickle`: this contains lists of items, grouped by category (primary, melee, etc). Each item has a name, an ID (generated from the name and used as an HTML element ID), and the file name of the corresponding image. This is used by `main.py` with a Jinja2 template to generate the webpage HTML contents, specifically creating a clickable box for each item with its name and image, with the ID used as the element ID.
- `item-components.js`: this contains a JSON object called `itemComponents` which maps each item's name to an array of components needed to build it. This data is used by the webpage to look up and then sum together all the components needed to build the selected items.
- `all-item-names.js`: this file contains a JSON object called `allItemNames` which maps each item's name to an HTML/JS-usable ID, which matches those in the `category_data.pickle` file.
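
To illustrate the look-up-and-sum step described for `item-components.js`, here is a minimal Python sketch of the same idea (the webpage itself does this in JavaScript). The item names and quantities are made up for illustration, and `total_components` is a hypothetical helper, not part of this repo.

```python
from collections import Counter

# Hypothetical sample in the shape described for `itemComponents`:
# item name -> list of {"name", "itemCount"} entries. All values are made up.
item_components = {
    "Example Rifle": [
        {"name": "Blueprint", "itemCount": 1},
        {"name": "Ferrite", "itemCount": 500},
        {"name": "Credits", "itemCount": 15000},
    ],
    "Example Pistol": [
        {"name": "Blueprint", "itemCount": 1},
        {"name": "Ferrite", "itemCount": 300},
        {"name": "Credits", "itemCount": 10000},
    ],
}


def total_components(selected_names, components_by_item):
    """Sum the component quantities over all selected items."""
    totals = Counter()
    for name in selected_names:
        for component in components_by_item[name]:
            totals[component["name"]] += component["itemCount"]
    return dict(totals)


totals = total_components(["Example Rifle", "Example Pistol"], item_components)
print(totals)
```

Because every entry uses the same `name`/`itemCount` shape, including the credit cost, the totals can be accumulated in a single pass without special cases.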

## Importing the JSON data

First we load in the data from each of the JSON files containing a category of items we are interested in. It is worth noting that we do not need to import the contents of `Arch-Gun.json` as this data is actually duplicated in `Primary.json`.

In [2]:
import json

with open('./data/json/Primary.json') as json_file:
    data_primary = json.load(json_file)

with open('./data/json/Secondary.json') as json_file:
    data_secondary = json.load(json_file)
    
with open('./data/json/Melee.json') as json_file:
    data_melee = json.load(json_file)
    
with open('./data/json/Sentinels.json') as json_file:
    data_sentinels = json.load(json_file)
    
with open('./data/json/Warframes.json') as json_file:
    data_warframes = json.load(json_file)
    
with open('./data/json/Archwing.json') as json_file:
    data_archwing = json.load(json_file)

with open('./data/json/Arch-Melee.json') as json_file:
    data_archmelee = json.load(json_file)
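
The loading cell above repeats the same pattern for each category. As an alternative, the following sketch loads the same files in a loop. It assumes the same `data/json` layout; `CATEGORIES` and `load_categories` are hypothetical names introduced here for illustration.

```python
import json
from pathlib import Path

# Categories loaded by the cell above. Arch-Gun is left out because its
# contents are duplicated in Primary.json.
CATEGORIES = [
    "Primary", "Secondary", "Melee", "Sentinels",
    "Warframes", "Archwing", "Arch-Melee",
]


def load_categories(json_dir, categories=CATEGORIES):
    """Return {category name: parsed JSON} for each category file in json_dir."""
    loaded = {}
    for category in categories:
        path = Path(json_dir) / f"{category}.json"
        with open(path, encoding="utf-8") as json_file:
            loaded[category] = json.load(json_file)
    return loaded
```

With this helper, `load_categories("./data/json")["Primary"]` would hold the same data as `data_primary` above.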

In [9]:
def item_has_components(item):
    """Return True if the item lists the components needed to build it."""
    return "components" in item

def recursive_component_finder(item):
    """Build a trimmed copy of the item holding only its name, build price, and component tree."""
    item_info = {"name": item["name"]}

    if item_has_components(item):
        item_info["buildPrice"] = item["buildPrice"]
        components = []
        for component in item["components"]:
            # Keep the quantity, then recurse to pick up the component's name and any subcomponents
            component_info = {"itemCount": component["itemCount"]}
            component_info = {**component_info, **recursive_component_finder(component)}
            components.append(component_info)
        item_info["components"] = components

    return item_info
    

def print_component_tree(item):
    item_data = recursive_component_finder(item)
    print(json.dumps(item_data, indent=2))
    
for weapon in data_primary:
    if weapon["name"].startswith("Kuva "):
        print_component_tree(weapon)

{
  "name": "Kuva Ayanga",
  "buildPrice": 0,
  "components": [
    {
      "itemCount": 1,
      "name": "Blueprint"
    }
  ]
}
{
  "name": "Kuva Bramma",
  "buildPrice": 0,
  "components": [
    {
      "itemCount": 1,
      "name": "Blueprint"
    }
  ]
}
{
  "name": "Kuva Chakkhurr",
  "buildPrice": 0,
  "components": [
    {
      "itemCount": 1,
      "name": "Blueprint"
    }
  ]
}
{
  "name": "Kuva Drakgoon",
  "buildPrice": 0,
  "components": [
    {
      "itemCount": 1,
      "name": "Blueprint"
    }
  ]
}
{
  "name": "Kuva Hind",
  "buildPrice": 0,
  "components": [
    {
      "itemCount": 1,
      "name": "Blueprint"
    }
  ]
}
{
  "name": "Kuva Karak",
  "buildPrice": 0,
  "components": [
    {
      "itemCount": 1,
      "name": "Blueprint"
    }
  ]
}
{
  "name": "Kuva Kohm",
  "buildPrice": 0,
  "components": [
    {
      "itemCount": 1,
      "name": "Blueprint"
    }
  ]
}
{
  "name": "Kuva Ogris",
  "buildPrice": 0,
  "components": [
    {
      "itemCount": 1,

## Generating item-components.js, all-item-names.js, and category_data.pickle

In [18]:
import pickle

items = {}
html_ids = {}
images = []

def item_has_components(item):
    """Perform a check to see if the provided item has components listed."""
    return "components" in item.keys()

def get_item_components(item):
    """Get the list of components required to build the item in the foundry, including credit cost."""
    components = [{"name": component["name"], "itemCount": component["itemCount"]} for component in item["components"]]
    components.append({"name": "Credits", "itemCount": item["buildPrice"]})
    return components

# Here we define a list of components whose subcomponents we should ignore; this is mainly for items with rarely used
# recipes, like the rare resources (100 platinum per blueprint??), and for refined gems and minerals which will need their
# own calculations added later due to non 1:1 recipes
sub_items_to_ignore = [
    "Forma", "Fieldron", "Detonite Injector", "Mutagen Mass",
    "Morphics", "Gallium", "Neurodes", "Control Module", "Neural Sensors", "Orokin Cell",
    "Copernics", "Pustrels", "Isos", "Carbides", "Cubic Diodes",
    "Auroxium Alloy", "Coprite Alloy", "Fersteel Alloy", "Pyrotic Alloy", "Tear Azurite", "Star Crimzian", "Esher Devar",
    "Heart Nyth", "Radian Sentirum", "Marquise Veridos", 
    "Axidrol Alloy", "Hespazym Alloy", "Travocyte Alloy", "Venerdo Alloy", "Star Amarast", "Goblite Tears", "Heart Noctrul",
    "Smooth Phasmin", "Marquise Thyst", "Radiant Zodian", 
    "Adramal Alloy", "Tempered Bapholite", "Devolved Namalon", "Thaumic Distillate", "Purged Dagonic", "Cabochon Embolos",
    "Purified Heciphron", "Stellated Necrathene", "Faceted Tiametrite", "Trapezium Xenorhast",
]

def extract_useful_data(data):
    """Extract the components needed for each item and store them against the item name in the items dictionary.
    Also, return the names of items added to the dictionary from the current set of provided data.
    """
    item_names = []  # This will be populated with an entry per item in the format [name, image name, HTML ID]
    
    # Here we iterate over the provided data and for each item populate the item_names list, as well as the global
    # dictionaries "items" and "html_ids", which are used to generate the JavaScript files, and the global list
    # "images" which is used to copy all of the required images to the webpage directory
    for item in data:
        # Skip items with no components and the Kuva weapons, which seem to have a blueprint listed but are not buildable
        if not item_has_components(item) or item["name"].startswith("Kuva "):
            continue
            
        # Get the item's list of components 
        item_components = get_item_components(item)
        
        # Check each component for subcomponents, skipping some we want to ignore such as refined minerals, and if it has
        # subcomponents, append them to the item's total list of components
        for component in item["components"]:
            if item_has_components(component) and component["name"] not in sub_items_to_ignore:
                item_components += get_item_components(component)
                
        # Store the item's list of components in the global items dict, using the item name as the key
        items[item["name"]] = item_components
        
        # Store the item's image name in the global images list, then do the same for each component's image
        images.append(item["imageName"])
        for component in item["components"]:
            images.append(component["imageName"])
        
        # Generate the HTML ID for the item and store it in the global HTML IDs dict, using the item name as the key
        code_safe_name = item["name"].lower().replace(" ", "-").replace("&", "and")
        html_ids[item["name"]] = code_safe_name
        
        # Add the item's name, image name, and HTML ID to the item names list
        item_names.append([item["name"], item["imageName"], code_safe_name])
        
    return item_names

# Extract the costs from all of the loaded datasets
item_names_primary = extract_useful_data(data_primary)
item_names_secondary = extract_useful_data(data_secondary)
item_names_melee = extract_useful_data(data_melee)
item_names_sentinels = extract_useful_data(data_sentinels)
item_names_warframes = extract_useful_data(data_warframes)
item_names_archwing = extract_useful_data(data_archwing)
item_names_archmelee = extract_useful_data(data_archmelee)

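# Move the buildable sentinel weapons from Primary to Sentinel. We iterate over a copy of the list, because removing
# entries from a list while iterating over it directly can cause entries to be skipped
for entry in list(item_names_primary):
    if entry[0] in ["Cryotra", "Helstrum", "Tazicor", "Vulcax"]:
        item_names_primary.remove(entry)
        item_names_sentinels.append(entry)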

# Move the Bonewidow and Voidrig necramechs to the end of the warframes list, to match the codex
item_names_necramechs = []
for entry in list(item_names_warframes):  # Iterate over a copy, since we remove entries from the original list
    if entry[0] in ["Bonewidow", "Voidrig"]:
        item_names_warframes.remove(entry)
        item_names_necramechs.append(entry)
item_names_warframes += item_names_necramechs

# Append the arch-melee items to the melee item list, and the archwings to the warframes item list, to match the codex
item_names_melee += item_names_archmelee
item_names_warframes += item_names_archwing

# Wrap all of the category item name data into a list of dictionaries and pickle it for later use in the webpage
# generation script
category_data = [
    {"name": "Primary", "item_info": item_names_primary},
    {"name": "Secondary", "item_info": item_names_secondary},
    {"name": "Melee", "item_info": item_names_melee},
    {"name": "Warframes & Vehicles", "item_info": item_names_warframes},
    {"name": "Sentinels", "item_info": item_names_sentinels},
]
with open("category_data.pickle", "wb") as f:
    pickle.dump(category_data, f)

# Save the item components dictionary as a JavaScript file, so that it can be directly loaded by the page
with open("./webpage/js/item-components.js", "w") as f:
    f.write("var itemComponents = " + json.dumps(items))

# Save the item name to HTML IDs dictionary as a JavaScript file, so that it can be directly loaded by the page
with open("./webpage/js/all-item-names.js", "w") as f:
    f.write("var allItemNames = " + json.dumps(html_ids))
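
To make the flattening and ID generation in the cell above easier to follow, here is a small standalone sketch run on a single made-up item. `flatten_components`, `make_html_id`, and every value in `sample_item` are hypothetical and exist only for illustration. The logic is intended to mirror `get_item_components` and the loop in `extract_useful_data`, not replace them.

```python
def make_html_id(name):
    """Lower-case the name, turn spaces into hyphens, and spell out '&' as 'and'."""
    return name.lower().replace(" ", "-").replace("&", "and")


def flatten_components(item, ignore=()):
    """List an item's components plus one level of subcomponents, with credit costs."""
    def direct(entry):
        parts = [{"name": c["name"], "itemCount": c["itemCount"]} for c in entry["components"]]
        parts.append({"name": "Credits", "itemCount": entry["buildPrice"]})
        return parts

    flat = direct(item)
    for component in item["components"]:
        # Only expand components that have their own recipe and are not on the ignore list
        if "components" in component and component["name"] not in ignore:
            flat += direct(component)
    return flat


# Made-up item in the same shape as the warframe-items JSON entries used above
sample_item = {
    "name": "Example & Sample",
    "buildPrice": 20000,
    "components": [
        {"name": "Blueprint", "itemCount": 1},
        {"name": "Example Barrel", "itemCount": 1, "buildPrice": 5000,
         "components": [{"name": "Ferrite", "itemCount": 200}]},
        {"name": "Forma", "itemCount": 1, "buildPrice": 35000,
         "components": [{"name": "Example Resource", "itemCount": 1}]},
    ],
}

flat = flatten_components(sample_item, ignore=["Forma"])
html_id = make_html_id(sample_item["name"])
print(html_id)
for part in flat:
    print(part)
```

Note that, as in the cell above, subcomponent quantities are appended as listed rather than multiplied by the parent component's `itemCount`, and `Credits` can appear more than once in the flattened list, so the per-name totals still need to be summed afterwards.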

## Copying the useful images to the webpage folder

While generating the data in the previous section, we stored the image name of each item we encountered in the `images` list. Here we copy each listed image from `data/img` into `webpage/img`.

In [11]:
import os
from shutil import copyfile

for image in set(images):
    if os.path.isfile("data/img/" + image):
        copyfile("data/img/" + image, "webpage/img/" + image)
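
The cell above assumes that `webpage/img` already exists and silently skips any image that is not found. A slightly more defensive variant is sketched below. `copy_images` is a hypothetical helper that creates the destination folder if needed and returns the names it could not find.

```python
import os
from shutil import copyfile


def copy_images(image_names, src_dir, dst_dir):
    """Copy each distinct image from src_dir to dst_dir and return the names not found."""
    os.makedirs(dst_dir, exist_ok=True)  # Create the destination folder if it is missing
    missing = []
    for image in sorted(set(image_names)):
        src = os.path.join(src_dir, image)
        if os.path.isfile(src):
            copyfile(src, os.path.join(dst_dir, image))
        else:
            missing.append(image)
    return missing
```

Calling `copy_images(images, "data/img", "webpage/img")` would then perform the same copy as above, and the returned list could be printed to spot items whose images are absent from `data/img`.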