
# Table of contents

>[New product proposal by Riku Driscoll](#scrollTo=ns82jDg95jra)

>>[Problem Statement](#scrollTo=kJcLmelxTzkx)

>>[Idea Statement](#scrollTo=fnlp0XXFZjzB)

>>[Discovery Activities](#scrollTo=90rIVHF4benU)

>>[Stakeholders / Business Strategy](#scrollTo=C2PjqfwrbxwL)

>>[Success Criteria](#scrollTo=zFeeLrC5b1Ye)

>>[Appendix](#scrollTo=tzEMkwQEb4pE)

>>>[If I had more time...](#scrollTo=tzEMkwQEb4pE)

>>>[Resource / Reference](#scrollTo=tzEMkwQEb4pE)

>[Import Datasets](#scrollTo=0a7_27cDW6Dz)



# New product proposal by Riku Driscoll

## Problem Statement

The given datasets currently have several problems:
- Explainability
  - The data cannot be explained (or used for inference) on its own
- Traceability
  - The data cannot be traced without the help of other datasets
- Actionable insights
  - The given datasets provide only basic information and no intelligence (actionable insight)

## Idea Statement

- I propose creating a proprietary **MoneyTree Consumer Price Index (MCPI)**, extended by a complementary product, **Embedded CPI**.
- The product is Alternative Data as a Service: it provides proprietary CPI data built from MoneyTree datasets combined with publicly available datasets. Unlike a traditional CPI, MCPI and Embedded CPI can also gauge consumer sentiment (similar to the Michigan consumer sentiment index).
- One of many use cases of the economic index is to help represent "the psychology of the entire [Japanese consumer behavior, sentiment] at any given moment. [MCPI] is an exact thumbprint of that moment." ([Reference to the show, Billions](https://www.reddit.com/r/Billions/comments/elr833/taylor_and_nasa_man_quote_failed_quant_hire/?utm_source=share&utm_medium=ios_app&utm_name=iossmf))
- The economic index would provide business intelligence to customers (private and public sector organizations, academic researchers, think tanks, etc.).
- In the private sector, the economic index would provide actionable intelligence, allowing organizations to review their business strategies and reallocate resources accordingly.
- In the public sector, relevant government agencies can use the economic index as a signal when proposing evidence-based policy making (EBPM).
- In academic research, the economic index can be used to facilitate ongoing studies in econometrics (promoting collaboration with academics while improving the economic index).
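
As a rough illustration of the arithmetic behind a CPI-style measure, the sketch below computes a Laspeyres-type fixed-basket index, `100 * sum(p_t * q_0) / sum(p_0 * q_0)`, a standard formula covered in CPI methodology manuals. The proposal does not specify which formula MCPI would use, so this is an assumption; the category names, prices, and basket quantities are hypothetical placeholders, not MoneyTree data.

```python
def laspeyres_index(base_prices, current_prices, base_quantities):
    """Return 100 * sum(p_t * q_0) / sum(p_0 * q_0) over the basket items."""
    # Cost of the base-period basket at base-period prices.
    base_cost = sum(base_prices[k] * base_quantities[k] for k in base_quantities)
    # Cost of the same (fixed) basket at current prices.
    current_cost = sum(current_prices[k] * base_quantities[k] for k in base_quantities)
    if base_cost == 0:
        raise ValueError("base-period basket cost must be non-zero")
    return 100.0 * current_cost / base_cost


# Hypothetical two-item basket, for illustration only.
base_prices = {"food": 100.0, "transport": 200.0}
current_prices = {"food": 110.0, "transport": 190.0}
base_quantities = {"food": 3, "transport": 1}

index = laspeyres_index(base_prices, current_prices, base_quantities)
print(round(index, 1))  # 104.0: the fixed basket costs 520 now versus 500 in the base period
```

Because the basket quantities are fixed at the base period, the index isolates price changes from changes in how much consumers buy.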

## Discovery Activities

- Identify roadmaps / milestones based on limited resources.
- Research the credibility of the datasets, especially to minimize biases. The key is to expand and diversify data sources and corroborate them so that the data is not skewed (e.g. by age, location, or payment method).
- Create a working prototype and conduct a proof-of-concept trial to understand customer journeys (in an attempt to surface blind spots, both unknown unknowns and known unknowns).
- Research companies providing similar services (e.g. Nowcast, Macromoney) and create a business model that provides a continuous stream of income with the ability to scale (while providing a big enough moat to secure a position in the market).
- Identify deficiencies and prioritize resources: can we create the product internally, or does a partnership make sense? (Consider short-, medium-, and long-run success.)
- Identify points of failure. For example, what happens if a data source becomes unavailable?

## Stakeholders / Business Strategy
- Different stages of maturity will require different stakeholders.
- In the early stage, the team would rely heavily on engineers, econometric professionals, product managers (from other teams) and design researchers (UX/UI, academia, head of products, etc.). Here, the key is to repurpose as many resources as possible, as fast as possible, to operationalize and scale the product.
- During the growth stage, I believe it's important to bring more senior members into the loop to help the product expand. Inevitably, we would expand the engineering team to support added demand. Here, the key is to gain exposure and learn the strengths and weaknesses of the product, allowing us to review strategies. Strategic transformation and change management can only happen with senior members' input.
- Once we reach a certain maturity, I believe it's important to reassess each moving part within the product to trim excess resources. The goal is to be easily maintainable and scalable (e.g. I propose a 100% serverless application on AWS using AWS SAM, an AWS-native IaC framework, to enable visibility, explainability and traceability).
- Throughout every stage, it's important to include compliance in the loop. My goal is to create a product that successfully influences the financial market, consumer behavior, and government policy making. It's crucial to maintain sustainable growth by focusing on the "G" of ESG.
- In every stage, it's crucial to gather feedback internally AND externally while iterating on the product to best suit customer needs.

## Success Criteria

- The goal is to "create products that help people & trusted organizations harness the power of data for good. Our plan for getting there is to deliver value through transparent insights that help people make clear financial decisions". 
- To gauge and measure product success, we would collect both qualitative and quantitative feedback.
- Qualitative
  - Number of data sources and partnerships (e.g. gauging credibility based on the number of "reviewers" and "auditors")
  - Customer journey + time to impact (e.g. getting feedback on usability)
  - The product would have a feature that allows users to trace specific data: the number of TRACE activities could indicate impact
- Quantitative
  - MAU (e.g. monitoring user access of Embedded CPI)
  - Number of projects (e.g. monitoring the number of projects the economic index is involved in)
  - Reliability (e.g. monitoring availability - or the downtime - to gauge technical reliability)
  - Adoption rate
  - Engagement rate, Retention rate
  - Return on Capital, Gross Margin
  - Impact
    - Empower organizations with actionable insights that enable investments of different kinds.
    - Here, my ultimate goal is to create a small investment fund within MoneyTree that generates alternative revenue. The investment strategy would hinge on the MCPI and a vector autoregression (VAR) model.
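
A vector autoregression models each series as a linear function of lagged values of all the series. As a minimal sketch, the code below fits a first-order VAR by ordinary least squares with NumPy and makes a one-step forecast. The two series (a stand-in for MCPI and a second indicator) are synthetic and generated only for illustration, and the helper names `fit_var1` and `forecast_one_step` are hypothetical; a real strategy would need lag selection, diagnostics, and probably a dedicated statistics library.

```python
import numpy as np


def fit_var1(series):
    """Fit y_t = c + A @ y_{t-1} + e_t by ordinary least squares.

    series: array of shape (T, k). Returns (c, A) with shapes (k,) and (k, k).
    """
    y = np.asarray(series, dtype=float)
    lagged = y[:-1]   # y_{t-1}
    target = y[1:]    # y_t
    # Design matrix: a column of ones (intercept) followed by the lagged values.
    design = np.hstack([np.ones((len(lagged), 1)), lagged])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    c = coef[0]
    A = coef[1:].T    # transpose so that row i holds the equation for series i
    return c, A


def forecast_one_step(c, A, last_obs):
    """One-step-ahead forecast from the most recent observation."""
    return c + A @ np.asarray(last_obs, dtype=float)


# Synthetic data from a known, stable VAR(1); illustration only.
rng = np.random.default_rng(0)
true_c = np.array([0.5, 0.2])
true_A = np.array([[0.6, 0.1],
                   [0.2, 0.5]])
y = np.zeros((500, 2))
for t in range(1, 500):
    y[t] = true_c + true_A @ y[t - 1] + rng.normal(scale=0.1, size=2)

c_hat, A_hat = fit_var1(y)
next_value = forecast_one_step(c_hat, A_hat, y[-1])
print(A_hat.round(2))       # estimated coefficient matrix
print(next_value.round(3))  # forecast for the next period
```

The off-diagonal entries of the estimated matrix are what would let one series (for example, MCPI) inform the forecast of another.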
  

## Appendix

### If I had more time...
- I'd conduct an ontology analysis and inference analysis to potentially discover correlations
- I'd bring in other data sources (like location) to better understand user behavior
- I'd create a knowledge graph to easily identify relationships between transactions, etc. 
- I'd do more research on how to improve the data product lifecycle, so I can provide better answers to "Discovery activities" and "Stakeholders" sections. 

### Resource / Reference
- ["How the Consumer Price Index Is Compiled" (消費者物価指数の作り方)](https://www.stat.go.jp/data/cpi/2015/mikata/pdf/2.pdf)
- [Statistics Bureau of Japan](https://www.stat.go.jp/english/data/cpi/1585.html)
- [Mainstream CPIs around the world and assets affected by each CPI](https://www.avatrade.com/education/economic-indicators/fundamental-indicators/consumer-price-index)
- [IMF - CPI Manual - Concepts and Methods](https://www.imf.org/-/media/Files/Data/CPI/cpi-manual-concepts-and-methods.ashx)

# Import Datasets

In [1]:
# Mount Google Drive so the dataset files stored there can be read from Colab.
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [3]:
import pandas as pd

# Load the older transaction extract and preview it.
df_past = pd.read_csv("/content/drive/MyDrive/Colab Notebooks/datasets/past.csv")
display(df_past)

Unnamed: 0,date,description_raw,amount,created_at,updated_at,account_id,description_guest,description_pretty,category_id,expense_type,claim_id,expense_type_categorizer,id,raw_transaction_id
0,2019-07-17 15:00:00,フアミリ―マ―トトミタエキマエ,-100.0,2019-07-30 02:28:24.463617,2019-07-30 02:28:24.463617,13198024,,フアミリ―マ―トトミタエキマエ,14,0,,0,1644490565,1617410773
1,2019-07-19 15:00:00,すき家,-960.0,2019-07-30 02:28:06.799996,2019-07-30 02:28:06.799996,12135256,,すき家,27,0,,0,1644490010,1617410223
2,2019-07-26 15:00:00,ドン・キホーテ 八千代店,-3530.0,2019-07-30 02:29:43.488975,2019-07-30 02:29:43.488975,5797835,,ドン・キホーテ 八千代店,40,0,,0,1644492444,1617412636
3,2019-07-21 15:00:00,ソラシド　エア　インターネット,-23990.0,2019-07-30 02:28:22.028402,2019-07-30 02:28:22.028402,5719110,,ソラシド エア インターネット,73,0,,0,1644490516,1617410724
4,2019-08-12 15:00:00,手数料・利息,-6906.0,2019-07-30 02:29:46.443417,2019-07-30 02:29:46.443417,7625184,,リボ払い 手数料,35,0,,0,1644492519,1617412710
5,2019-07-24 15:00:00,大戸屋　阪急大井町ガーデン店,-1430.0,2019-07-30 02:29:55.618318,2019-07-30 02:29:55.618318,10449422,,大戸屋 阪急大井町ガーデン店,40,0,,0,1644492746,1617412934
6,2019-07-23 15:00:00,ペイペイ＊クスリノアオキ,-1195.0,2019-07-30 02:28:28.362856,2019-07-30 02:28:28.362856,7363778,,ペイペイ*クスリノアオキ,40,0,,0,1644490641,1617410849
7,2019-07-20 15:00:00,セブン－イレブン岐阜粟野西１丁目店／ｉＤ,-750.0,2019-07-30 02:29:56.008265,2019-07-30 02:29:56.008265,3843092,,セブンーイレブン岐阜粟野西1丁目店/iD,14,0,,0,1644492753,1617412942
8,2019-07-26 15:00:00,ＧＯＯＧＬＥ ＊ＬＩＮＥ ＣＯＲＰ,-240.0,2019-07-30 02:29:18.401709,2019-07-30 02:29:18.401709,10904426,,GOOGLE *LINE CORP,26,0,,0,1644491867,1617412067
9,2019-07-18 15:00:00,ﾒﾙｶﾘ /,-800.0,2019-07-30 02:28:34.354073,2019-07-30 02:28:34.354073,4217343,,メルカリ /,12,0,,0,1644490791,1617410995


In [4]:
df_recent = pd.read_csv("/content/drive/MyDrive/Colab Notebooks/datasets/recent.csv")
display(df_recent)

Unnamed: 0,date,description_raw,amount,created_at,updated_at,account_id,description_guest,description_pretty,category_id,expense_type,claim_id,expense_type_categorizer,id,raw_transaction_id
0,2020-07-13 15:00:00,天下一品 天王寺店／ｉＤ,-1500.0,2020-07-17 02:27:20.556657,2020-07-17 02:27:20.556657,5942643,,天下一品 天王寺店/iD,27,0,,0,2544490000,2527899666
1,2020-07-06 15:00:00,楽天西友ネットスーパー 790001,-5959.0,2020-07-17 02:27:20.605587,2020-07-17 02:27:20.605587,13043578,,楽天西友ネットスーパー 790007,40,0,,0,2544490002,2527899668
2,2020-05-26 15:00:00,エポスカード ７月分家賃,-55000.0,2020-07-17 02:27:20.610542,2020-07-17 02:27:20.610542,18949857,,エポスカード 7月分家賃,50,0,,0,2544490003,2527899669
3,2020-07-14 15:00:00,フレスコ河原町丸太町店／ｉＤ,-751.0,2020-07-17 02:27:20.60067,2020-07-17 02:27:20.60067,5942643,,フレスコ河原町丸太町店/iD,40,0,,0,2544490005,2527899671
4,2020-07-14 15:00:00,ＡＭＡＺＯＮ．ＣＯ．ＪＰ,-3368.0,2020-07-17 02:27:20.639536,2020-07-17 02:27:20.639536,5942643,,AMAZON.CO.JP,11,0,,0,2544490007,2527899674
5,2020-07-06 15:00:00,楽天ＳＰ　オーケー　川口店 722010,-6642.0,2020-07-17 02:27:20.667881,2020-07-17 02:27:20.667881,13043578,,楽天SP オーケー 川口店 710010,40,0,,0,2544490008,2527899673
6,2020-07-14 15:00:00,ＪＲ東日本 えきねっと,-15010.0,2020-07-17 02:27:20.664026,2020-07-17 02:27:20.664026,10526151,,JR東日本 えきねっと,71,0,,0,2544490009,2527899675
7,2020-05-26 15:00:00,エポスカード ７月分保証料,-820.0,2020-07-17 02:27:20.713364,2020-07-17 02:27:20.713364,18949857,,エポスカード 7月分保証料,50,0,,0,2544490011,2527899678
8,2020-07-14 15:00:00,ＡＭＡＺＯＮ．ＣＯ．ＪＰ,-3454.0,2020-07-17 02:27:20.685683,2020-07-17 02:27:20.685683,5942643,,AMAZON.CO.JP,11,0,,0,2544490012,2527899677
9,2020-07-09 15:00:00,ドミノ・ピザ 720197,-972.0,2020-07-17 02:27:20.737485,2020-07-17 02:27:20.737485,13043578,,ドミノ・ピザ 720197,27,0,,0,2544490013,2527899679
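
One way the two extracts could feed a CPI-style measure is to compare spending per `category_id` between the two periods. The sketch below is only illustrative: it rebuilds a few rows from the previews above as small in-memory frames (so it runs without the CSV files) and compares the mean absolute transaction amount per category. Mean ticket size is a crude stand-in assumed here for illustration, not a price index, and the names `sample_past`, `sample_recent`, and `mean_ticket_by_category` are hypothetical.

```python
import pandas as pd

# A few (category_id, amount) pairs copied from the previews above.
sample_past = pd.DataFrame({
    "category_id": [14, 27, 40, 40, 14],
    "amount": [-100.0, -960.0, -3530.0, -1430.0, -750.0],
})
sample_recent = pd.DataFrame({
    "category_id": [27, 40, 40, 27],
    "amount": [-1500.0, -5959.0, -751.0, -972.0],
})


def mean_ticket_by_category(df):
    """Mean absolute transaction amount per category_id."""
    return df["amount"].abs().groupby(df["category_id"]).mean()


# Building the frame from two Series aligns them on the union of category ids.
comparison = pd.DataFrame({
    "past": mean_ticket_by_category(sample_past),
    "recent": mean_ticket_by_category(sample_recent),
})
# A ratio above 1 means the average ticket grew; a category missing
# from either sample yields NaN.
comparison["ratio"] = comparison["recent"] / comparison["past"]
print(comparison)
```

With so few rows the ratios mean little; the point is the shape of the computation, which would run unchanged on `df_past` and `df_recent`.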


In [8]:
# List each distinct cleaned merchant description in the older extract.
arr_past = df_past["description_pretty"].unique()
for name in arr_past:
  print(name)

フアミリ―マ―トトミタエキマエ
すき家
ドン・キホーテ 八千代店
ソラシド エア インターネット
リボ払い 手数料
大戸屋 阪急大井町ガーデン店
ペイペイ*クスリノアオキ
セブンーイレブン岐阜粟野西1丁目店/iD
GOOGLE *LINE CORP
メルカリ /
セイコーマート/iD
Yストア
スマートEX(JR東海)
AMAZON.CO.JP
出光 
P-TALK
QP/フアミリーマート
フアミリーマートセイロカガーデン
nanacoクレジットチャージ
ACADEMIA くまざわ書店
スーパーミラベルニシスガモテン
出光興産 148762 02 37.00
ジャパンタクシー
バロー デンポウジテン
d払いB スギ薬局グループ 
華正樓 新館売店
シネマシテイ
STAATSOPER FUEHRUNGE WIEN
文化堂 豊洲店
CHILDRENSALO(441892779110)
ファミリーマート/iD
おさかな本舗たいこ茶屋
クルメホテル エスプリ
ヤフーウォレット決済*クリックポ
UNTERES BELVEDERE WIEN
JR東日本モバイルSuica
BILL ITUNES COM
ボンドガール 2ND
Amazon Downloads
WAONオート
キラヤ 大島店
太陽石油
日本生命保険 相互会社
小牧セントラルホテル
マルエツプチ ニシシンジユクサンチヨウメ
ETC 関西支社
ローソン札幌北7条西一丁目
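
The raw descriptions mix full-width Latin characters, half-width katakana, and ideographic spaces, which makes merchant names harder to group. A minimal sketch, assuming Unicode NFKC normalization is an acceptable first cleaning step (the source does not say how `description_pretty` is actually produced, and `normalize_description` is a hypothetical helper):

```python
import unicodedata


def normalize_description(text):
    """Fold width variants (full-width Latin, half-width katakana) with NFKC, then trim."""
    return unicodedata.normalize("NFKC", text).strip()


# Raw values taken from the description_raw column shown above.
raw_examples = ["ＧＯＯＧＬＥ ＊ＬＩＮＥ ＣＯＲＰ", "ﾒﾙｶﾘ /", "ＡＭＡＺＯＮ．ＣＯ．ＪＰ"]
normalized = [normalize_description(s) for s in raw_examples]
for raw, clean in zip(raw_examples, normalized):
    print(raw, "->", clean)
```

NFKC alone may not reproduce every `description_pretty` value, so it is better treated as a preprocessing step before any merchant matching than as a replacement for the existing cleaning.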
