In [1]:
import numpy as np
import pandas as pd
np.random.seed(12345)
import matplotlib.pyplot as plt

In this example I use a public API that returns, in JSON format, the issues and pull requests (bug reports and proposed improvements) of the pandas repository on GitHub:

The first step is to set up access; for that I will use the "requests" package.

In [2]:
import requests

In [3]:
url = "https://api.github.com/repos/pandas-dev/pandas/issues"

In [7]:
resp = requests.get(url)

In [10]:
resp.raise_for_status()

In [11]:
resp

<Response [200]>

It is good practice to always call raise_for_status after requests.get to check for HTTP errors: it raises an exception when the status code is in the 4xx or 5xx range and does nothing otherwise.
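To make the effect of raise_for_status concrete, here is a minimal sketch. The helper name describe_status is hypothetical, and the two Response objects are built by hand (an assumption made only so the example runs without network access); in the notebook they would come from requests.get.

```python
import requests


def describe_status(resp: requests.Response) -> str:
    """Return 'ok' for a successful response, otherwise the HTTP error text."""
    try:
        resp.raise_for_status()
    except requests.exceptions.HTTPError as err:
        return f"request failed: {err}"
    return "ok"


# Hand-built responses (illustration only, no request is sent).
ok = requests.Response()
ok.status_code = 200

missing = requests.Response()
missing.status_code = 404

print(describe_status(ok))
print(describe_status(missing))
```

Catching requests.exceptions.HTTPError keeps a failed request from silently feeding an error page into the rest of the analysis.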

The json method of the response object returns the parsed JSON data as a Python object: a dictionary or a list, depending on the JSON that was returned:

In [12]:
data = resp.json()
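The difference between the two possible return types can be sketched with the standard json module, which performs the same kind of parsing. The two payload strings below are made up for illustration and are much smaller than a real GitHub response.

```python
import json

# Hypothetical payloads: a JSON array of issues and a single JSON object.
issues_payload = '[{"number": 1, "title": "first"}, {"number": 2, "title": "second"}]'
single_payload = '{"number": 1, "title": "first"}'

issues_parsed = json.loads(issues_payload)   # JSON array  -> Python list
single_parsed = json.loads(single_payload)   # JSON object -> Python dict

print(type(issues_parsed).__name__, len(issues_parsed))
print(type(single_parsed).__name__, single_parsed["title"])
```

The issues endpoint used here returns an array, which is why data[0] is a valid way to reach the first issue.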

Let's look at the first element of the JSON, which in this case is a dictionary with many fields:

In [13]:
data[0]

{'url': 'https://api.github.com/repos/pandas-dev/pandas/issues/55297',
 'repository_url': 'https://api.github.com/repos/pandas-dev/pandas',
 'labels_url': 'https://api.github.com/repos/pandas-dev/pandas/issues/55297/labels{/name}',
 'comments_url': 'https://api.github.com/repos/pandas-dev/pandas/issues/55297/comments',
 'events_url': 'https://api.github.com/repos/pandas-dev/pandas/issues/55297/events',
 'html_url': 'https://github.com/pandas-dev/pandas/pull/55297',
 'id': 1913395391,
 'node_id': 'PR_kwDOAA0YD85bOaTP',
 'number': 55297,
 'title': 'ExtensionArray.interpolate() method and tests',
 'user': {'login': 'andrewgsavage',
  'id': 13157776,
  'node_id': 'MDQ6VXNlcjEzMTU3Nzc2',
  'avatar_url': 'https://avatars.githubusercontent.com/u/13157776?v=4',
  'gravatar_id': '',
  'url': 'https://api.github.com/users/andrewgsavage',
  'html_url': 'https://github.com/andrewgsavage',
  'followers_url': 'https://api.github.com/users/andrewgsavage/followers',
  'following_url': 'https://api.git

Important: the results are retrieved from live data, so what you get when you run this code will differ from what is shown in this example.

We can load the JSON into a DataFrame to make the data easier to read:

In [14]:
df = pd.DataFrame(data)
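As a small offline illustration of what this conversion does, the sketch below uses hypothetical records shaped like a heavily trimmed issue. It also shows pd.json_normalize as one possible way to flatten nested fields such as user; the notebook itself does not use that function.

```python
import pandas as pd

# Hypothetical records, shaped like a trimmed-down GitHub issue payload.
records = [
    {"number": 1, "title": "first", "user": {"login": "alice"}},
    {"number": 2, "title": "second", "user": {"login": "bob"}},
]

# One row per dictionary; nested dictionaries are kept as Python objects.
df_plain = pd.DataFrame(records)

# json_normalize expands nested dictionaries into dotted column names.
df_flat = pd.json_normalize(records)

print(df_plain.columns.tolist())
print(df_flat.columns.tolist())
```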

Exploring the data we obtained:

In [15]:
df.head(6)

Unnamed: 0,url,repository_url,labels_url,comments_url,events_url,html_url,id,node_id,number,title,...,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,timeline_url,performed_via_github_app,state_reason
0,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas...,https://github.com/pandas-dev/pandas/pull/55297,1913395391,PR_kwDOAA0YD85bOaTP,55297,ExtensionArray.interpolate() method and tests,...,,NONE,,False,{'url': 'https://api.github.com/repos/pandas-d...,- [ ] closes #xxxx (Replace xxxx with the GitH...,{'url': 'https://api.github.com/repos/pandas-d...,https://api.github.com/repos/pandas-dev/pandas...,,
1,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas...,https://github.com/pandas-dev/pandas/issues/55296,1913217005,I_kwDOAA0YD85yCV_t,55296,BUG: pandas.DataFrame.to_parquet() causing mem...,...,,NONE,,,,### Pandas version checks\r\n\r\n- [X] I have ...,{'url': 'https://api.github.com/repos/pandas-d...,https://api.github.com/repos/pandas-dev/pandas...,,
2,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas...,https://github.com/pandas-dev/pandas/issues/55295,1913209938,I_kwDOAA0YD85yCURS,55295,BUG: compatibility between pd.Timestamp() and ...,...,,NONE,,,,### Pandas version checks\n\n- [X] I have chec...,{'url': 'https://api.github.com/repos/pandas-d...,https://api.github.com/repos/pandas-dev/pandas...,,
3,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas...,https://github.com/pandas-dev/pandas/issues/55293,1913022771,I_kwDOAA0YD85yBmkz,55293,BUG: `date_range` `inclusive` parameter behavi...,...,,NONE,,,,### Pandas version checks\n\n- [X] I have chec...,{'url': 'https://api.github.com/repos/pandas-d...,https://api.github.com/repos/pandas-dev/pandas...,,
4,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas...,https://github.com/pandas-dev/pandas/issues/55289,1912154899,I_kwDOAA0YD85x-SsT,55289,ENH: Improve Filter function with Filter_Colum...,...,,NONE,,,,### Feature Type\n\n- [X] Adding new functiona...,{'url': 'https://api.github.com/repos/pandas-d...,https://api.github.com/repos/pandas-dev/pandas...,,
5,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas...,https://github.com/pandas-dev/pandas/issues/55287,1911999272,I_kwDOAA0YD85x9sso,55287,QST: get the last one value in a dataframe's c...,...,,NONE,,,,### Research\n\n- [X] I have searched the [[pa...,{'url': 'https://api.github.com/repos/pandas-d...,https://api.github.com/repos/pandas-dev/pandas...,,


In [16]:
df.tail(3)

Unnamed: 0,url,repository_url,labels_url,comments_url,events_url,html_url,id,node_id,number,title,...,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,timeline_url,performed_via_github_app,state_reason
27,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas...,https://github.com/pandas-dev/pandas/issues/55247,1909147636,I_kwDOAA0YD85xy0f0,55247,BUG: NumbaPendingDeprecationWarning with numba...,...,,NONE,,,,### Pandas version checks\r\n\r\n- [X] I have ...,{'url': 'https://api.github.com/repos/pandas-d...,https://api.github.com/repos/pandas-dev/pandas...,,
28,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas...,https://github.com/pandas-dev/pandas/pull/55246,1909057662,PR_kwDOAA0YD85a_9uu,55246,Create broken-linkcheck.yml,...,,CONTRIBUTOR,,False,{'url': 'https://api.github.com/repos/pandas-d...,Created a Github Action to run the Sphinx link...,{'url': 'https://api.github.com/repos/pandas-d...,https://api.github.com/repos/pandas-dev/pandas...,,
29,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas...,https://api.github.com/repos/pandas-dev/pandas...,https://github.com/pandas-dev/pandas/issues/55245,1908932126,I_kwDOAA0YD85xx_4e,55245,PERF: hash_pandas_object performance regressio...,...,,NONE,,,,### Pandas version checks\n\n- [X] I have chec...,{'url': 'https://api.github.com/repos/pandas-d...,https://api.github.com/repos/pandas-dev/pandas...,,


In [18]:
df.shape

(30, 30)
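The 30 rows match the default page size of the GitHub REST API. As a hedged sketch, the per_page and state query parameters of that API can be passed through the params argument of requests. The URL is only prepared here, not sent, so the example runs offline; the values 100 and "open" are illustrative choices.

```python
import requests

url = "https://api.github.com/repos/pandas-dev/pandas/issues"

# Illustrative query parameters: up to 100 items per page, open items only.
params = {"per_page": 100, "state": "open"}

# Prepare the request without sending it, to inspect the final URL.
prepared = requests.Request("GET", url, params=params).prepare()
print(prepared.url)

# To actually fetch the data, the same arguments go to requests.get:
# resp = requests.get(url, params=params, timeout=10)
```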

In [19]:
df.columns

Index(['url', 'repository_url', 'labels_url', 'comments_url', 'events_url',
       'html_url', 'id', 'node_id', 'number', 'title', 'user', 'labels',
       'state', 'locked', 'assignee', 'assignees', 'milestone', 'comments',
       'created_at', 'updated_at', 'closed_at', 'author_association',
       'active_lock_reason', 'draft', 'pull_request', 'body', 'reactions',
       'timeline_url', 'performed_via_github_app', 'state_reason'],
      dtype='object')

Suppose the analysis needs only a few of these columns; we can create a DataFrame containing just those:

In [20]:
issues = pd.DataFrame(data, columns=["id", "number", "title", "labels", "state", "created_at"])

In [21]:
issues.head()

Unnamed: 0,id,number,title,labels,state,created_at
0,1913395391,55297,ExtensionArray.interpolate() method and tests,[],open,2023-09-26T12:22:54Z
1,1913217005,55296,BUG: pandas.DataFrame.to_parquet() causing mem...,"[{'id': 76811, 'node_id': 'MDU6TGFiZWw3NjgxMQ=...",open,2023-09-26T10:38:12Z
2,1913209938,55295,BUG: compatibility between pd.Timestamp() and ...,"[{'id': 76811, 'node_id': 'MDU6TGFiZWw3NjgxMQ=...",open,2023-09-26T10:33:35Z
3,1913022771,55293,BUG: `date_range` `inclusive` parameter behavi...,"[{'id': 76811, 'node_id': 'MDU6TGFiZWw3NjgxMQ=...",open,2023-09-26T08:52:47Z
4,1912154899,55289,ENH: Improve Filter function with Filter_Colum...,"[{'id': 76812, 'node_id': 'MDU6TGFiZWw3NjgxMg=...",open,2023-09-25T19:31:26Z
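The labels column holds a list of dictionaries per issue and created_at holds ISO 8601 strings, so a little post-processing helps before analysis. The sketch below is one way to do it on hypothetical records; it assumes each label dictionary has a "name" key, as in the GitHub payload, and the new column name label_names is my own choice.

```python
import pandas as pd

# Hypothetical records with the same field names as the GitHub payload.
records = [
    {"number": 1, "title": "first", "state": "open", "locked": False,
     "labels": [{"name": "Bug"}, {"name": "IO"}],
     "created_at": "2023-09-26T12:22:54Z"},
    {"number": 2, "title": "second", "state": "open", "locked": False,
     "labels": [],
     "created_at": "2023-09-25T19:31:26Z"},
]

# columns= keeps only the listed keys, in the given order.
subset = pd.DataFrame(records, columns=["number", "title", "labels", "created_at"])

# Reduce each list of label dictionaries to a plain list of label names.
subset["label_names"] = subset["labels"].apply(
    lambda labels: [label["name"] for label in labels]
)

# Parse the timestamp strings into pandas datetimes.
subset["created_at"] = pd.to_datetime(subset["created_at"])

print(subset[["number", "label_names", "created_at"]])
```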


To export this DataFrame to a .csv file:

In [23]:
# Note: /content exists by default on Google Colab; adjust the path for other environments.
issues.to_csv("/content/issues.csv", encoding="utf-8")
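By default to_csv also writes the row index as an unnamed first column, which shows up as "Unnamed: 0" when the file is read back. The sketch below compares the default with index=False on a small made-up DataFrame, using in-memory buffers instead of files so that nothing is written to disk.

```python
import io

import pandas as pd

# Small made-up DataFrame standing in for `issues`.
issues_demo = pd.DataFrame({"number": [1, 2], "title": ["first", "second"]})

# Default behaviour: the index is written as an extra, unnamed column.
with_index = io.StringIO()
issues_demo.to_csv(with_index)
with_index.seek(0)
cols_default = pd.read_csv(with_index).columns.tolist()

# index=False writes only the data columns.
without_index = io.StringIO()
issues_demo.to_csv(without_index, index=False)
without_index.seek(0)
cols_no_index = pd.read_csv(without_index).columns.tolist()

print(cols_default)
print(cols_no_index)
```

Whether to keep the index depends on the use case; for a plain 0..n-1 index, dropping it usually gives a cleaner file.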