<a href="https://colab.research.google.com/github/mailappserver/LLM-guide/blob/main/WIPO.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Title: Use Power BI with Google Colab in Python to analyze Patent data
Author: Lawrence Teixeira

Date: 16/10/2022


How to connect to Power BI - [Documentation](https://github.com/microsoft/powerbi-jupyter/blob/main/DOCUMENTATION.md#get_pages) - [Power BI Blog](https://powerbi.microsoft.com/pt-br/blog/announcing-power-bi-in-jupyter-notebooks/)

Install the Power BI requirements

In [None]:
!pip install powerbiclient

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting powerbiclient
  Downloading powerbiclient-2.0.1-py2.py3-none-any.whl (787 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 787.5/787.5 KB 10.3 MB/s eta 0:00:00
Collecting msal>=1.8.0
  Downloading msal-1.21.0-py2.py3-none-any.whl (89 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 89.9/89.9 KB 4.9 MB/s eta 0:00:00
Collecting jupyter-ui-poll>=0.1.2
  Downloading jupyter_ui_poll-0.2.2-py2.py3-none-any.whl (9.0 kB)
Collecting PyJWT[crypto]<3,>=1.0.0
  Downloading PyJWT-2.6.0-py3-none-any.whl (20 kB)
Collecting cryptography<41,>=0.6
  Downloading cryptography-39.0.1-cp36-abi3-manylinux_2_28_x86_64.whl (4.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.2/4.2 MB 12.0 MB/s eta 0:00:00
Collecting jedi>=0.10
  Downloading jedi-0.18.2-py2.py3-none-any.whl (1.6 MB


In [None]:
from powerbiclient import Report, models
from powerbiclient.authentication import DeviceCodeLoginAuthentication
import pandas as pd
from google.colab import drive
from google.colab import output
from urllib import request
import zipfile
import requests

In [None]:
# mount Google Drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


Load the 2022 authority file data from WIPO [PATENTSCOPE](https://patentscope.wipo.int/search/en/search.jsf).

In [None]:
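# URL of the 2022 authority file published by WIPO PATENTSCOPE
file_url = "https://patentscope.wipo.int/search/static/authority/2022.zip"

# Stream the download so the archive is written to Drive in small chunks
r = requests.get(file_url, stream=True)
r.raise_for_status()  # stop on an HTTP error instead of saving an error page

with open("/content/gdrive/My Drive/2022.zip", "wb") as file:
    for block in r.iter_content(chunk_size=1024):
        if block:
            file.write(block)

# Open 2022.csv directly from the archive, without extracting it
compressed_file = zipfile.ZipFile('/content/gdrive/My Drive/2022.zip')

csv_file = compressed_file.open('2022.csv')

# The file is semicolon-delimited; the column names are supplied explicitly
data = pd.read_csv(csv_file, delimiter=";", names=["Publication Number","Publication Date","Title","Kind Code","Application No","Classification","Applicant","Url"])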
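
Reading a CSV member straight out of a ZIP archive can be tried without any network access. The sketch below builds a tiny archive in memory and reads one member back. The member name `sample.csv` and its contents are made up for illustration and are not taken from the WIPO file.

```python
import io
import zipfile

# Build a tiny archive in memory so the example runs offline.
# The member name and its contents are hypothetical.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("sample.csv", "WO/2022/000001;2022-01-06;EXAMPLE TITLE\n")

# Reopen the archive, list its members, and read one without extracting it.
buffer.seek(0)
with zipfile.ZipFile(buffer) as archive:
    members = archive.namelist()
    with archive.open(members[0]) as member:
        first_line = member.readline().decode("utf-8").strip()

print(members)
print(first_line)
```

Calling `namelist()` first is a cheap way to confirm that the expected CSV name is actually inside the downloaded archive before passing it to `pd.read_csv`.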

In [None]:
# Show the first rows of the data
data.head()

Unnamed: 0,Publication Number,Publication Date,Title,Kind Code,Application No,Classification,Applicant,Url
0,sep=,,,,,,,
1,Publication Number,Publication Date,Title,Kind Code,Application No,Classification,Applicant,Url
2,WO/2022/000001,2022-01-06,SELF-CONCEALING SYSTEM FOR THE EASIER INSERTIO...,A1,AP2021/000002,A45D 8/20,"CHIMBWELENGE, Sandra Dac",http://patentscope.wipo.int/search/en/WO202200...
3,WO/2022/000002,2022-01-06,METHOD AND SYSTEM FOR STORING AND OUTPUTTING E...,A1,AT2021/000011,H02J 15/00,"ULRICH, Gregor Anton",http://patentscope.wipo.int/search/en/WO202200...
4,WO/2022/000003,2022-01-06,ADAPTER FOR AN EARPHONE,A1,AT2021/000013,H04R 1/10,FIRST WEST GMBH,http://patentscope.wipo.int/search/en/WO202200...
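
The output above shows that the `sep=` hint line and the file's own header row were loaded as data rows 0 and 1. One possible alternative to dropping them afterwards is to skip them while reading. This is a minimal sketch on a made-up three-column sample that assumes the same layout (a `sep=` line, a header line, then records). It is not the real WIPO file.

```python
import io
import pandas as pd

# Made-up sample mimicking the layout seen above: a "sep=" hint line,
# a header line, then semicolon-delimited records.
raw = (
    "sep=;\n"
    "Publication Number;Publication Date;Title\n"
    "WO/2022/000001;2022-01-06;EXAMPLE TITLE A\n"
    "WO/2022/000002;2022-01-06;EXAMPLE TITLE B\n"
)

columns = ["Publication Number", "Publication Date", "Title"]

# skiprows=2 drops the "sep=" line and the file's own header,
# so only data rows are loaded under the names we supply.
sample = pd.read_csv(io.StringIO(raw), delimiter=";", names=columns, skiprows=2)
print(sample.shape)
```

With this approach the index starts at 0 rather than 2, so row labels would differ from the outputs shown in this notebook.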


In [None]:
# Transformations of the CSV file downloaded from WIPO

# Remove the first two rows (the "sep=" line and the duplicated header row)
data = data.iloc[2:].copy()

# Create a new column holding the first character of the classification (the section letter)
data["Classification_Name"] = data["Classification"].str[:1]

# Replace the section letter with its description
data["Classification_Name"] = data["Classification_Name"].replace({
    'A': 'Human Necessities',
    'B': 'Performing Operations and Transporting',
    'C': 'Chemistry and Metallurgy',
    'D': 'Textiles and Paper',
    'E': 'Fixed Constructions',
    'F': 'Mechanical Engineering',
    'G': 'Physics',
    'H': 'Electricity'
  }
)

In [None]:
# Show the first rows again
data.head()

Unnamed: 0,Publication Number,Publication Date,Title,Kind Code,Application No,Classification,Applicant,Url,Classification_Name
2,WO/2022/000001,2022-01-06,SELF-CONCEALING SYSTEM FOR THE EASIER INSERTIO...,A1,AP2021/000002,A45D 8/20,"CHIMBWELENGE, Sandra Dac",http://patentscope.wipo.int/search/en/WO202200...,Human Necessities
3,WO/2022/000002,2022-01-06,METHOD AND SYSTEM FOR STORING AND OUTPUTTING E...,A1,AT2021/000011,H02J 15/00,"ULRICH, Gregor Anton",http://patentscope.wipo.int/search/en/WO202200...,Electricity
4,WO/2022/000003,2022-01-06,ADAPTER FOR AN EARPHONE,A1,AT2021/000013,H04R 1/10,FIRST WEST GMBH,http://patentscope.wipo.int/search/en/WO202200...,Electricity
5,WO/2022/000004,2022-01-06,SUN PROTECTION DEVICE,A1,AT2021/060148,A47C 7/66,"SEELAUS, Franz",http://patentscope.wipo.int/search/en/WO202200...,Human Necessities
6,WO/2022/000006,2022-01-06,"METHOD FOR INTERACTION, MORE PARTICULARLY ENER...",A1,AT2021/060216,B60L 53/12,VOLTERIO GMBH,http://patentscope.wipo.int/search/en/WO202200...,Performing Operations and Transporting
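
Once each row carries a section name, a quick count per section is a useful sanity check before building the report. The sketch below uses a small hand-made sample (the four classification symbols visible in the output above) rather than the full dataset. It uses `.map` instead of `.replace`, which is a swapped-in variant: `.map` yields `NaN` for any letter missing from the dictionary, whereas `.replace` leaves the letter unchanged.

```python
import pandas as pd

# Illustrative sample only; the notebook itself works on the full WIPO file.
sample = pd.DataFrame(
    {"Classification": ["A45D 8/20", "H02J 15/00", "H04R 1/10", "B60L 53/12"]}
)

section_names = {
    "A": "Human Necessities",
    "B": "Performing Operations and Transporting",
    "C": "Chemistry and Metallurgy",
    "D": "Textiles and Paper",
    "E": "Fixed Constructions",
    "F": "Mechanical Engineering",
    "G": "Physics",
    "H": "Electricity",
}

# The first character of the classification symbol is the section letter.
sample["Classification_Name"] = sample["Classification"].str[:1].map(section_names)

# Number of publications per section in the sample.
counts = sample["Classification_Name"].value_counts()
print(counts)
```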


In [None]:
# Save the Excel file to Google Drive to share it with the Power BI report.
# The "datasets" folder must already exist in Drive.
data.to_excel("/content/gdrive/MyDrive/datasets/Result_WIPO2022.xlsx")
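
Writing the file fails if the target folder does not exist. A small helper can create it first. `ensure_parent_dir` is a hypothetical name introduced here for illustration, and the sketch runs in a temporary folder rather than on the mounted Drive.

```python
import tempfile
from pathlib import Path


def ensure_parent_dir(file_path):
    """Create the parent folder of file_path if needed and return it as a Path."""
    path = Path(file_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    return path


# Demonstrated in a temporary folder; in Colab the path would sit under the mounted Drive.
with tempfile.TemporaryDirectory() as tmp:
    target = ensure_parent_dir(Path(tmp) / "datasets" / "Result_WIPO2022.xlsx")
    parent_exists = target.parent.is_dir()

print(parent_exists)
```

In the notebook, the returned path could then be passed to `data.to_excel(...)`.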

In [None]:
# Authenticate against Power BI using the Microsoft device code flow
device_auth = DeviceCodeLoginAuthentication()

Performing device flow authentication. Please follow the instructions below.
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code RQ7Y97G8B to authenticate.

Device flow authentication successfully completed.
You are now logged in.


Click [here](https://carldesouza.com/how-to-get-the-workspace-groupid-and-datasetid-from-the-url-in-power-bi/) to learn how to get the workspace and report IDs from the Power BI service.
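
Both IDs can also be pulled out of the report URL programmatically. The sketch below assumes the URL has the shape `https://app.powerbi.com/groups/<workspace id>/reports/<report id>/...`. The helper name `extract_ids` is hypothetical, and the GUIDs in the example are placeholders.

```python
import re

# Assumed URL shape: .../groups/<workspace id>/reports/<report id>/...
# Each ID is expected to be a 36-character GUID.
_URL_PATTERN = re.compile(
    r"/groups/(?P<group_id>[0-9a-fA-F-]{36})/reports/(?P<report_id>[0-9a-fA-F-]{36})"
)


def extract_ids(report_url):
    """Return (group_id, report_id) parsed from a Power BI report URL."""
    match = _URL_PATTERN.search(report_url)
    if match is None:
        raise ValueError("URL does not contain /groups/<id>/reports/<id>")
    return match.group("group_id"), match.group("report_id")


# Placeholder GUIDs, not real workspace or report IDs.
example_url = (
    "https://app.powerbi.com/groups/00000000-0000-0000-0000-000000000001"
    "/reports/00000000-0000-0000-0000-000000000002/ReportSection"
)
group_id, report_id = extract_ids(example_url)
print(group_id, report_id)
```

A URL that does not carry a GUID in the workspace position does not match this pattern and raises `ValueError`, which is a deliberate choice in this sketch so that a malformed URL is caught before the report is created.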

In [None]:
group_id="fc2c6172-8350-4b64-9053-a0f9f1550c85"  # Replace with your Power BI workspace (group) ID
report_id="92ffd7ae-1c40-48ee-bade-7c84ca8b19f4"  # Replace with your Power BI report ID

# Create the embedded report, authenticated with the device login above
report = Report(group_id=group_id, report_id=report_id, auth=device_auth)
report.set_size(1024, 1600)  # container height, container width
output.enable_custom_widget_manager()  # allow Colab to render the report widget

In [None]:
# Show the Power BI report with the downloaded WIPO data.
report

Report(container_height=1024.0, container_width=1600.0)