# **wants_order_proc**
---

<br><br><br><br>

## **Objectives**

---

<br>

The objective is to remove a friction point by letting the user drop lists of wanted cards / decks into one centralized folder, where they are picked up and processed together.

<br>

- [**Data Flow Diagram**](/README.md/#data-flow-diagram)

<br><br>

### Previous revision interpreted objectives
1. glue together .csv files from an input directory (`input_wish_list_path`) and output to a directory (`wants_bin`)

- glue logic is good. Let's use that for v2. 
- there are other functions in here that we can use elsewhere

<br><br>

### Ideal objectives
**`process_wants`**
1. script that pulls all .csv and .txt files from a `wants_input` directory and glues them together into `wants_bin`
    - `wants_bin` schema
        - xxx

2. if a file was successfully processed in step 1, then move it out of the wants_input folder and into `wants_processed`
    - acting as a formal archive folder
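
Step 2 above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the helper name `archive_processed` is hypothetical, and the demo runs in a temporary directory rather than the real `wants_input` / `wants_processed` folders.

```python
import shutil
import tempfile
from pathlib import Path


def archive_processed(files, processed_dir):
    """Move each successfully processed file into the archive folder (hypothetical helper)."""
    processed_dir = Path(processed_dir)
    processed_dir.mkdir(parents=True, exist_ok=True)  # create the archive folder if missing
    moved = []
    for src in files:
        src = Path(src)
        dest = processed_dir / src.name
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved


# Demo in a throwaway directory so no real data is touched.
with tempfile.TemporaryDirectory() as tmp:
    wants_input = Path(tmp) / "wants_input"
    wants_processed = Path(tmp) / "wants_processed"
    wants_input.mkdir()
    sample = wants_input / "mdfc-list.csv"
    sample.write_text("1 Abrade\n")

    moved = archive_processed([sample], wants_processed)
    remaining = sorted(p.name for p in wants_input.iterdir())
    archived = sorted(p.name for p in wants_processed.iterdir())
```

The sketch does not handle name collisions in the archive folder; a date suffix similar to `wants_bin_arch_name` might be one way to avoid them.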

<br><br>

**`make_order_lists`**:
1. separate `wants_bin` into 2 order outputs: cardkingdom (or other card trading marketplace) and mpcfill.com
    - based on input `_cutoff_price`
2. if separated lists pass the quantity threshold for a given card order point (cardkingdom or makeplayingcards): `order_list` is saved and recorded in archive (`order_lists`)
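
A rough sketch of the split described above, under several assumptions that are not in the notes: `wants_bin` carries a hypothetical `price` column, cards at or below the cutoff go to the marketplace list while pricier cards go to the mpcfill list, and each order point has its own minimum quantity. The prices are made up for illustration.

```python
import pandas as pd


def make_order_lists(wants_bin, cutoff_price, min_order_qty):
    """Split wants by price, then keep only lists that reach their quantity threshold."""
    lists = {
        # Assumption: cheap cards are bought, expensive cards are proxied.
        "cardkingdom": wants_bin[wants_bin["price"] <= cutoff_price],
        "mpcfill": wants_bin[wants_bin["price"] > cutoff_price],
    }
    return {
        order_point: df
        for order_point, df in lists.items()
        if df["count"].sum() >= min_order_qty[order_point]
    }


wants_bin = pd.DataFrame({
    "count": [1, 2, 1],
    "name": ["Abrade", "Abrupt Decay", "Stoneforge Mystic"],
    "price": [0.5, 8.0, 30.0],  # made-up prices, for illustration only
})
order_lists = make_order_lists(
    wants_bin,
    cutoff_price=10.0,
    min_order_qty={"cardkingdom": 2, "mpcfill": 2},
)
```

Here only the `cardkingdom` list survives, because the single expensive card does not reach the `mpcfill` threshold of 2.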

<br><br>

- inputs:
    - `wants_input`
- outputs:
    - `wants_bin`
    - `order_lists`

<br><br>

### open questions

- x

<br><br><br><br>

## **Imports / Setup Environment**

---

In [35]:
import pandas as pd
from datetime import date

from fp_data_toolbox import eda, notifier

### **Variable Setup**

In [36]:
curr_dt = str(date.today())

wants_input_path = 'C:\\git\\mtg-proj\\data\\input\\wants_input'

wants_bin_filename = 'wants_bin-active.csv'
wants_bin_folder = 'C:\\git\\mtg-proj\\data\\output\\wants_bin'
wants_bin_path = wants_bin_folder+'\\'+wants_bin_filename

wants_bin_arch_name = 'wants_bin-'+curr_dt+'.csv'
wants_bin_arch_folder = 'C:\\git\\mtg-proj\\data\\output\\wants_bin\\archive'
wants_bin_arch_path = wants_bin_arch_folder+'\\'+wants_bin_arch_name

<br><br><br><br>

## **Data Ingestion**

---

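Some simple data cleaning happens here as well: each input line of the form `<count> <name>` is split into separate `count` and `name` columns, and the source filename is recorded in `input_filename`.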

In [37]:
### import all csv and txt files at once and glue them together
def ingest_wants_input(directory):
    import os

    def split_count_nm(count_nm_s):
        ### splitting quantity and name on single cell to a split column
        # split series on first space only; reindex guards against a missing name column
        split_df = count_nm_s.str.split(" ", expand=True, n=1).reindex(columns=[0, 1])
        return pd.DataFrame({"count": split_df[0], "name": split_df[1]})

    frames = []
    for entry in os.scandir(directory):
        if not entry.is_file():
            continue

        if entry.name.endswith(".csv"):
            print('csv search hit: ['+entry.name+']')
            df = pd.read_csv(entry.path, header=None, low_memory=False)
            df = split_count_nm(df[0]) # only the first column is used
            ### taking in filename as input for field
            df['input_filename'] = entry.name

        elif entry.name.endswith(".txt"):
            ### TODO [ ] write logic for processing txt input files (particularly from moxfield) ;noted_on:2022-09-29
                ### [ ] find example input .txt from moxfield for testing
                ### see archive_deck_list
            # drops the last two '-'-separated segments of the filename
            filename_str = entry.name.rsplit("-",2)[0]+".txt"
            print('txt search hit: ['+filename_str+']')
            ### read in a text file with 'read_csv' below
            df = pd.read_csv(entry.path, sep='\t', header=None, names=['count_nm'])
            df = split_count_nm(df['count_nm'])
            df['input_filename'] = filename_str

        else:
            continue

        frames.append(df)

    # glue everything together once; return an empty frame if nothing matched
    return pd.concat(frames, axis=0) if frames else pd.DataFrame()

###=============================================
wants_bin_df_src=ingest_wants_input(wants_input_path)


txt search hit: [budget-cedh-spell-interaction.txt]
csv search hit: [mdfc-list.csv]
csv search hit: [minus the drawback.csv]


In [38]:
# eda.copi_df(wants_bin_df_src)
# wants_bin_df_src

Unnamed: 0,count,name,input_filename
0,1,Abolish,budget-cedh-spell-interaction.txt
1,1,Abrade,budget-cedh-spell-interaction.txt
2,1,Abrupt Decay,budget-cedh-spell-interaction.txt
3,1,An Offer You Can't Refuse,budget-cedh-spell-interaction.txt
4,1,Angel's Grace,budget-cedh-spell-interaction.txt
...,...,...,...
12,1,Stoneforge Mystic,minus the drawback.csv
13,1,Sword of the Animist,minus the drawback.csv
14,1,Tribute Mage,minus the drawback.csv
15,1,Trinket Mage,minus the drawback.csv


In [28]:
#stop

<br><br><br>

## **Data Cleaning**

---

In [29]:
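### cleaning input data
# drop=True discards the old per-file index instead of keeping it as an extra 'index' column
wants_bin_df_src=wants_bin_df_src.reset_index(drop=True)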

In [30]:
nm_list = wants_bin_df_src['name'].to_list()
upd_nm_list = []
nm_str_find_list = []

substr = "/ "
inserttxt = "/"

In [31]:
for nm_str in nm_list:
    # normalize split / double-faced card names: "A / B" -> "A // B"
    # (the isinstance check skips missing names, which would otherwise raise)
    if isinstance(nm_str, str) and ' / ' in nm_str:
        nm_str = (inserttxt+substr).join(nm_str.split(substr))
    upd_nm_list.append(nm_str) # append list

wants_bin_df_src['name'] = upd_nm_list
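
The loop above rewrites a single-slash separator into the double-slash form. As a possible alternative, the same normalization can be sketched with pandas' vectorized string methods. The sample names are illustrative, and this version replaces only the exact `" / "` pattern, so it may differ from the loop on unusual inputs.

```python
import pandas as pd

# Illustrative names: one single-slash split card, one plain card,
# and one name that already uses the double-slash form.
names = pd.Series(["Fire / Ice", "Abrade", "Wear // Tear"])

# regex=False treats the pattern as a literal string.
normalized = names.str.replace(" / ", " // ", regex=False)
```

Missing values pass through `.str.replace` unchanged, so no explicit guard is needed in this form.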

### **Final Reindex**

In [32]:
wants_bin_df_src = wants_bin_df_src.reindex([
    'count',
    'name',
    'input_filename',
], axis=1)

In [34]:
eda.copi_df(wants_bin_df_src)
wants_bin_df_src

Unnamed: 0,count,name,input_filename
0,1,Abolish,budget-cedh-spell-interaction.txt
1,1,Abrade,budget-cedh-spell-interaction.txt
2,1,Abrupt Decay,budget-cedh-spell-interaction.txt
3,1,An Offer You Can't Refuse,budget-cedh-spell-interaction.txt
4,1,Angel's Grace,budget-cedh-spell-interaction.txt
...,...,...,...
347,1,Stoneforge Mystic,minus the drawback.csv
348,1,Sword of the Animist,minus the drawback.csv
349,1,Tribute Mage,minus the drawback.csv
350,1,Trinket Mage,minus the drawback.csv


In [None]:
#stop

<br><br><br><br>

## **Processing**

---

### **Initial Processing filter**

- TODO [ ] write logic for filtering out records from `wants_bin_df_src`: 2022-09-29
    - conditions for cards being processed into `wants_bin`
        - Don't process if card name is invalid 
            - reference against scryfall data, checking for valid card names
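
One way the name check could look, as a sketch only: it assumes the valid card names have already been loaded from Scryfall data into a plain set (`valid_names` here is a tiny stand-in), and the helper name `split_valid_names` is hypothetical. Matching is case-insensitive, which is also an assumption.

```python
def split_valid_names(names, valid_names):
    """Partition names into (valid, invalid) using a case-insensitive lookup."""
    valid_lookup = {n.casefold() for n in valid_names}
    valid, invalid = [], []
    for name in names:
        if name.casefold() in valid_lookup:
            valid.append(name)
        else:
            invalid.append(name)
    return valid, invalid


# Stand-in for the real Scryfall name list.
valid_names = {"Abrade", "Abrupt Decay", "Angel's Grace"}

valid, invalid = split_valid_names(
    ["Abrade", "abrupt decay", "Abrupt Decai"],
    valid_names,
)
```

Keeping the rejected names in a separate list makes it possible to report them back to the user instead of dropping them silently.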

In [None]:
### code here
wants_bin_df_src

### **Compare to collection filtering**

- TODO [ ] write logic for comparing `wants_bin_df_src` to relevant records: 2022-09-29
    - Don't process if we already own `x` units (reference collection_db)
    - Don't process if want_qty > `y`, where `y` is derived from collection_db['qty'] and wants_bin['qty']
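
A minimal sketch of the first condition (skip cards already owned in sufficient quantity). It assumes `collection_db` has one row per card with `name` and `qty` columns; the `name` column and the helper `drop_owned` are assumptions, not part of the existing code.

```python
import pandas as pd


def drop_owned(wants, collection, max_owned_qty):
    """Drop wanted cards that are already owned at or above max_owned_qty."""
    merged = wants.merge(collection[["name", "qty"]], on="name", how="left")
    merged["qty"] = merged["qty"].fillna(0)  # cards missing from the collection count as 0 owned
    keep = merged["qty"] < max_owned_qty
    return merged.loc[keep, wants.columns].reset_index(drop=True)


wants = pd.DataFrame({"count": [1, 1], "name": ["Abrade", "Trinket Mage"]})
collection_db = pd.DataFrame({"name": ["Abrade"], "qty": [4]})  # illustrative data

still_wanted = drop_owned(wants, collection_db, max_owned_qty=4)
```

If the collection can hold several rows per card (for example one per printing), the quantities would need to be summed per name before the merge.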

In [None]:
_max_owned_land_qty = 8
_max_owned_spell_qty = 4
_max_wants_qty = 4

- TODO [ ] scryfall data search for card type (land vs spell): 2022-09-29
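
A possible starting point for the land vs spell lookup, assuming each card's Scryfall `type_line` string is already available. The helper names are hypothetical, and the default limits simply mirror `_max_owned_land_qty` and `_max_owned_spell_qty` above.

```python
def card_kind(type_line):
    """Classify a card as 'land' or 'spell' from its type line."""
    return "land" if "Land" in type_line.split() else "spell"


def max_owned_qty(type_line, max_land=8, max_spell=4):
    """Pick the ownership cap that applies to this card type."""
    return max_land if card_kind(type_line) == "land" else max_spell


land_cap = max_owned_qty("Legendary Land")
spell_cap = max_owned_qty("Instant")
```

A modal double-faced card whose type line includes a land face would be classified as a land here, which may or may not be the desired behavior.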

In [None]:
### code here

In [None]:
#stop

<br><br><br>

## **Output**

---


In [None]:
wants_bin_df_src.to_csv(wants_bin_path,index=False)
wants_bin_df_src.to_csv(wants_bin_arch_path,index=False)

In [None]:
wants_bin_df_src

In [None]:
# stop