# Data Processing

The main purpose of this notebook is to convert the JSON data made available by the [wfcd/warframe-items](https://github.com/WFCD/warframe-items) GitHub repo into formats we can use for our webpage. We want to produce two formats: pickled Python objects, which are later passed to a Jinja2 template to generate the webpage, and JavaScript files, which the webpage imports when it loads.

The second purpose of this notebook is to cherry-pick the data we want. The data provided by wfcd/warframe-items includes far more detail than we need, including items' stats, build times, drop locations and chances, patch notes, and more. All we are really interested in are names, components and quantities, image names, and build costs.

#### Setup for this notebook

Download or clone the [wfcd/warframe-items](https://github.com/WFCD/warframe-items) repo. Copy the top-level `data` folder from warframe-items into this repo. This folder contains the JSON files with the item data under `data/json`, and the images for each item under `data/img`.

#### Output of this notebook

- `category_data.pickle`: this contains lists of items, sorted by category (primary, melee, etc.). Each item has a name, an ID (generated from the name and used as an HTML element ID), and the file name of the corresponding image. `main.py` uses this with a Jinja2 template to generate the webpage's HTML, creating a clickable box for each item that shows its name and image and uses the ID as the element ID.
- `item-components.js`: this contains a JSON object called `itemComponents` which maps each item's name to an array of the components needed to build it. The webpage uses this data to look up and then sum all the components needed to build the selected items.
- `all-item-names.js`: this contains a JSON object called `allItemNames` which maps each item's name to an HTML/JS-usable ID, matching the IDs in the `category_data.pickle` file.
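
To make the shapes of these outputs concrete, here is a minimal Python sketch of the look-up-and-sum step that the webpage performs on `itemComponents`, together with the ID rule used for `allItemNames`. The sample entries, their quantities, and the function names `to_element_id` and `sum_components` are illustrative assumptions, not real game data or the webpage's actual code (the page itself does this in JavaScript).

```python
from collections import Counter

# Toy stand-in for the contents of item-components.js (made-up quantities).
item_components = {
    "Item A": [
        {"name": "Ferrite", "itemCount": 300},
        {"name": "Credits", "itemCount": 15000},
    ],
    "Item B": [
        {"name": "Ferrite", "itemCount": 200},
        {"name": "Rubedo", "itemCount": 50},
        {"name": "Credits", "itemCount": 20000},
    ],
}


def to_element_id(name):
    """Mirror the notebook's rule for turning an item name into an element ID."""
    return name.lower().replace(" ", "-").replace("&", "and")


def sum_components(selected, components_by_item):
    """Add up the quantity of every component needed for the selected items."""
    totals = Counter()
    for item_name in selected:
        for component in components_by_item[item_name]:
            totals[component["name"]] += component["itemCount"]
    return dict(totals)


print(to_element_id("Sigma & Octantis"))
print(sum_components(["Item A", "Item B"], item_components))
```

Because credits are stored as just another component named `Credits`, the same loop totals build costs along with resources.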

## Importing the JSON data

First we load the data from each of the JSON files containing a category of items we are interested in. Note that we do not need to import `Arch-Gun.json`, as its contents are duplicated in `Primary.json`.

In [1]:
import json

with open('./data/json/Primary.json') as json_file:
    data_primary = json.load(json_file)

with open('./data/json/Secondary.json') as json_file:
    data_secondary = json.load(json_file)
    
with open('./data/json/Melee.json') as json_file:
    data_melee = json.load(json_file)
    
with open('./data/json/Sentinels.json') as json_file:
    data_sentinels = json.load(json_file)
    
with open('./data/json/Warframes.json') as json_file:
    data_warframes = json.load(json_file)
    
with open('./data/json/Archwing.json') as json_file:
    data_archwing = json.load(json_file)

with open('./data/json/Arch-Melee.json') as json_file:
    data_archmelee = json.load(json_file)
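
The repeated `open`/`json.load` blocks above could also be written as a loop. The following is only a sketch: the helper name `load_categories` and the dictionary-based layout are assumptions, and the rest of this notebook still expects the separate `data_primary`, `data_secondary`, etc. variables.

```python
import json
from pathlib import Path

# File stems of the category files loaded in this notebook.
CATEGORY_FILES = [
    "Primary", "Secondary", "Melee", "Sentinels",
    "Warframes", "Archwing", "Arch-Melee",
]


def load_categories(json_dir, names=CATEGORY_FILES):
    """Return {file stem: parsed JSON} for each category file in json_dir."""
    loaded = {}
    for name in names:
        with open(Path(json_dir) / f"{name}.json", encoding="utf-8") as json_file:
            loaded[name] = json.load(json_file)
    return loaded


# Example (requires the data folder described in the setup section):
# data = load_categories("./data/json")
# data_primary = data["Primary"]
```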

In [52]:
def item_has_components(item):
    """Return True if the item (or component) lists components of its own."""
    return "components" in item

def recursive_component_finder(item):
    """Build a tree of names and build prices for an item and its nested components."""
    item_info = {"name": item["name"]}
    
    if item_has_components(item):
        item_info["buildPrice"] = item["buildPrice"]
        item_info["components"] = [
            recursive_component_finder(component) for component in item["components"]
        ]
    
    return item_info
    

def print_component_tree(item):
    item_data = recursive_component_finder(item)
    print(json.dumps(item_data, indent=2))
    
for weapon in data_primary:
    if weapon["name"] == "Basmu":
        print_component_tree(weapon)

{
  "name": "Basmu",
  "buildPrice": 25000,
  "components": [
    {
      "name": "Blueprint"
    },
    {
      "name": "Copernics",
      "buildPrice": 1000,
      "components": [
        {
          "name": "Blueprint"
        },
        {
          "name": "Cryotic"
        },
        {
          "name": "Ferrite"
        },
        {
          "name": "Rubedo"
        }
      ]
    },
    {
      "name": "Isos",
      "buildPrice": 1000,
      "components": [
        {
          "name": "Blueprint"
        },
        {
          "name": "Circuits"
        },
        {
          "name": "Fieldron Sample"
        },
        {
          "name": "Salvage"
        }
      ]
    },
    {
      "name": "Nullstones"
    },
    {
      "name": "Pustrels",
      "buildPrice": 1000,
      "components": [
        {
          "name": "Blueprint"
        },
        {
          "name": "Ferrite"
        },
        {
          "name": "Nano Spores"
        },
        {
          "name": "Plastids

In [46]:
for weapon in data_primary:
    if weapon["name"] == "Basmu":
        print(json.dumps(weapon, indent=2))

{
  "name": "Basmu",
  "uniqueName": "/Lotus/Weapons/Sentients/SentRifleNewWar/SentRifleNewWarGun",
  "damagePerShot": [
    0,
    0,
    0,
    19,
    0,
    39,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0,
    0
  ],
  "totalDamage": 58,
  "description": "This Sentient war instrument can either barrage targets with explosive bolts, or, draw on its regenerative battery to create twin plasma beams that chain through targets. When fully drained, Health is leached from nearby foes for a short period.",
  "criticalChance": 0.15000001,
  "criticalMultiplier": 2,
  "procChance": 0.28999996,
  "fireRate": 12.000001,
  "masteryReq": 11,
  "productCategory": "LongGuns",
  "slot": 1,
  "accuracy": 20,
  "omegaAttenuation": 1.1,
  "noise": "Alarming",
  "trigger": "Auto",
  "magazineSize": 21,
  "reloadTime": 0.69999999,
  "multishot": 1,
  "buildPrice": 25000,
  "buildTime": 86400,
  "skipBuildTimePrice": 35,
  "buildQuantity": 1,
  "consumeOnBuil

## Generating item-components.js, all-item-names.js, and category_data.pickle

In [54]:
import pickle

items = {}
safe_names = {}
images = []

def item_has_components(item):
    return "components" in item.keys()

def get_item_components(item):
    """Get the list of components required to build the item in the foundry, including credit cost."""
    components = [{"name": component["name"], "itemCount": component["itemCount"]} for component in item["components"]]
    components.append({"name": "Credits", "itemCount": item["buildPrice"]})
    return components

sub_items_to_ignore = [
    "Forma", "Fieldron", "Detonite Injector", "Mutagen Mass",
    "Morphics", "Gallium", "Neurodes", "Control Module", "Neural Sensors", "Orokin Cell",
    "Copernics", "Pustrels", "Isos", "Carbides", "Cubic Diodes",
    "Auroxium Alloy", "Coprite Alloy", "Fersteel Alloy", "Pyrotic Alloy", "Tear Azurite", "Star Crimzian", "Esher Devar",
    "Heart Nyth", "Radian Sentirum", "Marquise Veridos", 
    "Axidrol Alloy", "Hespazym Alloy", "Travocyte Alloy", "Venerdo Alloy", "Star Amarast", "Goblite Tears", "Heart Noctrul",
    "Smooth Phasmin", "Marquise Thyst", "Radiant Zodian", 
    "Adramal Alloy", "Tempered Bapholite", "Devolved Namalon", "Thaumic Distillate", "Purged Dagonic", "Cabochon Embolos",
    "Purified Heciphron", "Stellated Necrathene", "Faceted Tiametrite", "Trapezium Xenorhast",
]


def extract_useful_data(data):
    """Extract the components needed for each item and store them against the item name in the items dictionary.
    Also, return the names of items added to the dictionary from the current set of provided data.
    """
    item_names = []
    for item in data:
        if item["name"].startswith("Kuva "):
            continue
        if item_has_components(item):
            # Get the item's components 
            item_components = get_item_components(item)
            
            # Check for nested components, and add them to the parent component's costs
            for component in item["components"]:
                if item_has_components(component):
                    # Skip certain items
                    if component["name"] in sub_items_to_ignore:
                        continue
                    
                    # Get the nested components and append them to the item's component list
                    nested_components = get_item_components(component)
                    item_components += nested_components
                    
#                     # Deal with the 3-layer nesting in Lesion and Proboscis Cernos
#                     if item["name"] in ["Lesion", "Proboscis Cernos"]:
#                         for sub_component in component["components"]:
#                             if item["name"] == "Proboscis Cernos": print([a["name"] for a in component["components"]])
#                             if item_has_components(sub_component):
#                                 nested_nested_components = get_item_components(sub_component)
#                                 item_components += nested_nested_components
                                
#                                 if item["name"] == "Proboscis Cernos": print(item_components)
                   
            items[item["name"]] = item_components
            images.append(item["imageName"])
            code_safe_name = item["name"].lower().replace(" ", "-").replace("&", "and")
            safe_names[item["name"]] = code_safe_name
            for component in item["components"]:
                images.append(component["imageName"])
            item_names.append([item["name"], item["imageName"], code_safe_name])
    return item_names

# Extract the costs from all of the loaded in data sets
item_names_archwing = extract_useful_data(data_archwing)
item_names_archmelee = extract_useful_data(data_archmelee)

item_names_primary = extract_useful_data(data_primary)
item_names_secondary = extract_useful_data(data_secondary)
item_names_melee = extract_useful_data(data_melee) + item_names_archmelee
item_names_sentinels = extract_useful_data(data_sentinels)
item_names_warframes = extract_useful_data(data_warframes) + item_names_archwing

# Move the buildable sentinel weapons from Primary to Sentinel.
# Iterate over a copy: removing from a list while looping over it skips entries.
for entry in list(item_names_primary):
    if entry[0] in ["Cryotra", "Helstrum", "Tazicor", "Vulcax"]:
        item_names_primary.remove(entry)
        item_names_sentinels.append(entry)
    

# item_names_archgun = extract_useful_data(data_archgun)

# Primary weapons includes arch-guns, here we simply remove these entries to prevent duplication
# for entry in item_names_archgun:
#     if entry in item_names_primary:
#         print(entry)
#         item_names_primary.remove(entry)

# print([a for a, _, _ in item_names_primary])
# print([a for a, _, _ in item_names_archgun])

category_data = [
    {"name": "Primary", "item_info": item_names_primary},
    {"name": "Secondary", "item_info": item_names_secondary},
    {"name": "Melee", "item_info": item_names_melee},
    {"name": "Warframes & Vehicles", "item_info": item_names_warframes},
    {"name": "Sentinels", "item_info": item_names_sentinels},
#     {"name": "Archwings", "item_info": item_names_architems},
]

# Save the category data so that main.py can access it easily
with open("category_data.pickle", "wb") as f:
    pickle.dump(category_data, f)

# Save the items dictionary as a javascript file, so that it can be directly loaded by the page
with open("./webpage/js/item-components.js", "w") as f:
    f.write("var itemComponents = " + json.dumps(items))
    
with open("./webpage/js/all-item-names.js", "w") as f:
    f.write("var allItemNames = " + json.dumps(safe_names))
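
The cell above folds only one level of nested components into the parent item, and the commented-out code hints at deeper nesting for Lesion and Proboscis Cernos. One possible way to handle arbitrary depth is a recursive helper, sketched below. This is not the notebook's implementation: `flatten_components` is a hypothetical name, the sample item is invented, and, like the cell above, the sketch does not multiply nested quantities by the parent's `itemCount`.

```python
def flatten_components(item, ignore=()):
    """Recursively list an item's components, plus a Credits entry per build step.

    Components named in `ignore` are kept as a single entry and not expanded,
    mirroring the role of sub_items_to_ignore.
    """
    flat = [
        {"name": c["name"], "itemCount": c["itemCount"]}
        for c in item["components"]
    ]
    flat.append({"name": "Credits", "itemCount": item["buildPrice"]})
    for component in item["components"]:
        if "components" in component and component["name"] not in ignore:
            flat += flatten_components(component, ignore)
    return flat


# Invented three-level example (not real game data).
sample = {
    "name": "Outer", "buildPrice": 100,
    "components": [
        {"name": "Middle", "itemCount": 1, "buildPrice": 10, "components": [
            {"name": "Inner", "itemCount": 2, "buildPrice": 1, "components": [
                {"name": "Ferrite", "itemCount": 5},
            ]},
        ]},
        {"name": "Forma", "itemCount": 1, "buildPrice": 0, "components": [
            {"name": "Blueprint", "itemCount": 1},
        ]},
    ],
}

for entry in flatten_components(sample, ignore={"Forma"}):
    print(entry["name"], entry["itemCount"])
```

Here `Forma` stays a single entry because it is in the ignore set, while `Middle` and `Inner` are both expanded, each contributing its own `Credits` entry.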

In [11]:
import os
from shutil import copyfile

for image in set(images):
    if os.path.isfile("data/img/" + image):
        copyfile("data/img/" + image, "webpage/img/" + image)