# Inventor Disambiguation Summaries

## Dependencies

#### Development environment

- Requires conda.
- Development environment can be created and updated using:
    ```shell
    cd PatentsView-Evaluation
    make env
    conda activate pv-evaluation
    ```
- Package **pv-evaluation** can be installed using `pip install -e .`

#### Package imports

In [1]:
import wget
import zipfile
import os

import plotly.io as pio
pio.templates.default = "plotly_white" # Set plotly theme

from pv_evaluation.summary import InventorDisambiguationSummary

#### Data download

The `rawinventor.tsv` dataset from PatentsView's bulk data downloads should be in the working directory. The following cell downloads and extracts it if it is missing.

In [2]:
if not os.path.isfile("rawinventor.tsv"):
    wget.download("https://s3.amazonaws.com/data.patentsview.org/download/rawinventor.tsv.zip")
    with zipfile.ZipFile("rawinventor.tsv.zip", 'r') as zip_ref:
        zip_ref.extractall(".")
    os.remove("rawinventor.tsv.zip")

## Examples

#### Setting up the InventorDisambiguationSummary object

The `processed_data_dir` option is set to the home directory so that processed data is saved and reused across executions of this notebook.

In [3]:
summarizer = InventorDisambiguationSummary("rawinventor.tsv", processed_data_dir="~/")
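
As a rough illustration of what reusing processed data across runs can look like, the sketch below caches the result of an expensive step on disk and loads it on later calls. The helper `load_or_compute`, the function `expensive_step`, and the cache file name are hypothetical and chosen for this example; `pv_evaluation` may handle `processed_data_dir` differently internally.

```python
import os
import pickle
import tempfile


def load_or_compute(cache_path, compute):
    """Return the cached result at cache_path, computing and saving it if absent.

    Hypothetical helper for illustration; not part of pv_evaluation.
    """
    cache_path = os.path.expanduser(cache_path)  # allows paths such as "~/cache.pkl"
    if os.path.isfile(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    result = compute()
    with open(cache_path, "wb") as f:
        pickle.dump(result, f)
    return result


calls = []  # records how many times the expensive step really runs


def expensive_step():
    calls.append(1)
    return {"n_rows": 3}


# A temporary directory stands in for the home directory used in the notebook.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "processed.pkl")
    first = load_or_compute(path, expensive_step)   # computes and writes the cache
    second = load_or_compute(path, expensive_step)  # reads the cache instead
```

The second call returns the same result without running `expensive_step` again, which is the behavior a persistent processed-data directory is meant to provide.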

#### Cluster size distribution data

In [None]:
summarizer.get_cluster_size_distribution()

| | Number of patents | Number of inventors |
|---|---|---|
| 0 | 1 | 1754888 |
| 1 | 2 | 636677 |
| 2 | 3 | 340817 |
| 3 | 4 | 218452 |
| 4 | 5 | 151663 |
| ... | ... | ... |
| 645 | 577 | 1 |
| 646 | 582 | 1 |
| 647 | 586 | 1 |
| 648 | 593 | 1 |
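
To make the meaning of these counts concrete, the sketch below computes a cluster size distribution from a toy list of inventor mentions using only the standard library. It assumes each mention is a `(patent_id, inventor_id)` pair and that a cluster's size is the number of distinct patents sharing an inventor ID; the toy data and variable names are made up for illustration, and this is not necessarily how `get_cluster_size_distribution()` is implemented.

```python
from collections import Counter

# Toy stand-in for inventor mentions: (patent_id, inventor_id) pairs.
# The IDs are invented for this example.
mentions = [
    ("p1", "a"), ("p2", "a"), ("p3", "a"),
    ("p4", "b"),
    ("p5", "c"), ("p6", "c"),
    ("p7", "d"),
]

# Collect the distinct patents attributed to each inventor ID.
patents_by_inventor = {}
for patent_id, inventor_id in mentions:
    patents_by_inventor.setdefault(inventor_id, set()).add(patent_id)

# Cluster size = number of distinct patents per inventor ID.
cluster_sizes = {inv: len(patents) for inv, patents in patents_by_inventor.items()}

# Distribution = number of inventors at each cluster size, sorted by size.
distribution = sorted(Counter(cluster_sizes.values()).items())

for n_patents, n_inventors in distribution:
    print(n_patents, n_inventors)
```

Each resulting pair corresponds to one row of the output: a number of patents and the number of inventors with exactly that many patents.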


#### Cluster size distribution plot

In [None]:
summarizer.plot_cluster_size_distribution(range=(1,10))

#### Top inventors

In [None]:
summarizer.get_top_inventors()

| inventor_id | Number of patents | name_first | name_last |
|---|---|---|---|
| fl:sh_ln:yamazaki-81 | 6027 | Shunpei | Yamazaki |
| fl:ki_ln:silverbrook-1 | 4778 | Kia | Silverbrook |
| fl:ka_ln:cheng-65 | 2523 | Kangguo | Cheng |
| fl:jo_ln:ive-8 | 2084 | Jonathan P. | Ive |
| fl:lo_ln:wood-15 | 1956 | Lowell L. | Wood, Jr. |
| fl:ro_ln:hyde-3 | 1879 | Roderick A. | Hyde |
| fl:du_ln:kerr-15 | 1738 | Duncan Robert | Kerr |
| fl:ba_ln:andre-17 | 1716 | Bartley K. | Andre |
| fl:ri_ln:howarth-13 | 1632 | Richard P. | Howarth |
| fl:ch_ln:stringer-16 | 1612 | Christopher J. | Stringer |
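
A ranking like this can be sketched as counting distinct patents per inventor ID and keeping the largest counts. The example below uses toy `(patent_id, inventor_id)` pairs and a hypothetical helper `top_inventors`; it is a minimal illustration under those assumptions, not the implementation behind `get_top_inventors()`.

```python
from collections import Counter


def top_inventors(mentions, n=10):
    """Return the n inventor IDs with the most distinct patents.

    Hypothetical helper for illustration; `mentions` is an iterable of
    (patent_id, inventor_id) pairs.
    """
    # set() drops repeated mentions of the same inventor on the same patent.
    counts = Counter(inventor_id for _, inventor_id in set(mentions))
    # Sort by descending count, then by ID so that ties are ordered deterministically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


# Toy data with invented IDs; ("p1", "a") appears twice on purpose.
mentions = [
    ("p1", "a"), ("p1", "a"), ("p2", "a"), ("p3", "a"),
    ("p4", "b"),
    ("p5", "c"), ("p6", "c"),
]

top_two = top_inventors(mentions, n=2)
print(top_two)
```

Names such as `name_first` and `name_last` could then be attached by looking up each returned inventor ID in the source data.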
