# 4. API Request

In [1]:
import numpy as np
import pandas as pd
from stackapi import StackAPI

We will extract 50 questions published on `stackoverflow.com` between 1 March and 31 December 2020 (the COVID-19 period) that carry the *python* tag, sorted by descending score.

Both dates are converted to Unix timestamps (seconds elapsed since 1 January 1970, UTC).
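As a minimal sketch of that conversion, the helper below turns a calendar date into a Unix timestamp using only the standard library. The function name `to_unix` and the constants `FROM_DATE` and `TO_DATE` are illustrative names, not part of the original notebook, and the sketch assumes both dates are taken at midnight UTC.

```python
from datetime import datetime, timezone


def to_unix(year, month, day):
    """Return the Unix timestamp (in seconds) of midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


# Assumption: both bounds are taken at 00:00:00 UTC.
FROM_DATE = to_unix(2020, 3, 1)    # 1 March 2020
TO_DATE = to_unix(2020, 12, 31)    # 31 December 2020

print(FROM_DATE, TO_DATE)
```

Under these assumptions the helper yields `1583020800` and `1609372800`, the two values passed as `fromdate` and `todate` in the request. Note that the upper bound is midnight UTC at the start of 31 December.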

In [2]:
SITE = StackAPI('stackoverflow')
SITE.page_size = 50  # Number of results per page
SITE.max_pages = 1   # Number of pages

In [3]:
questions = SITE.fetch('questions', tagged='python', sort='votes',
                       fromdate=1583020800, todate=1609372800, filter='withbody')

We now store these questions in a dataframe. We keep only the questions whose score is greater than 50 and select the following features: *Date, Titre, Question, Tags, Score*.

In [4]:
# Extract the relevant fields:
data = []
for question in questions['items']:
    if question['score'] > 50:
        question_data = {
            'Date': pd.to_datetime(question['creation_date'], unit='s'),
            'Titre': question['title'],
            'Question': question['body'],
            'Tags': question['tags'],
            'Score': question['score']
        }
        data.append(question_data)

# Store the result in a dataframe:
data = pd.DataFrame(data)
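
The same filtering and column selection can also be written with vectorized pandas operations instead of an explicit loop. The sketch below is one possible equivalent, not the notebook's own method. It runs on a small made-up list, `sample_items`, that only mimics the shape of `questions['items']`; the titles, bodies, and scores in it are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for questions['items'] (made-up values).
sample_items = [
    {'creation_date': 1595181079, 'title': 'Example question A',
     'body': '<p>Body A</p>', 'tags': ['python', 'pip'], 'score': 373,
     'view_count': 1000},
    {'creation_date': 1583020800, 'title': 'Example question B',
     'body': '<p>Body B</p>', 'tags': ['python'], 'score': 12,
     'view_count': 10},
]

# Mapping from API field names to the column names used in this notebook.
columns = {
    'creation_date': 'Date',
    'title': 'Titre',
    'body': 'Question',
    'tags': 'Tags',
    'score': 'Score',
}

raw = pd.DataFrame(sample_items)
df = (
    raw.loc[raw['score'] > 50, list(columns)]   # keep scores above 50 and the 5 fields
       .rename(columns=columns)                 # switch to the notebook's column names
       .assign(Date=lambda d: pd.to_datetime(d['Date'], unit='s'))
       .reset_index(drop=True)
)

print(df)
```

Only the first sample row survives the score filter, and the extra `view_count` field is dropped by the column selection.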

In [5]:
data.head()

| | Date | Titre | Question | Tags | Score |
|---|---|---|---|---|---|
| 0 | 2020-07-19 17:51:19 | What is pyproject.toml file for? | `<h3>Background</h3>\n<p>I was about to try Pyt...` | [python, pip, packaging, pyproject.toml] | 373 |
| 1 | 2020-12-11 15:53:36 | xlrd.biffh.XLRDError: Excel xlsx file; not sup... | `<p>I am trying to read a macro-enabled Excel w...` | [python, pandas, xlrd, pcf] | 293 |
| 2 | 2020-07-02 02:56:46 | sqlalchemy.exc.NoSuchModuleError: Can&#39;t lo... | `<p>I'm trying to connect to a Postgres databas...` | [python, postgresql, sqlalchemy, flask-sqlalch... | 240 |
| 3 | 2020-11-22 08:53:15 | docker.errors.DockerException: Error while fet... | `<p>I want to install this module but there is ...` | [python, linux, docker, docker-compose] | 238 |
| 4 | 2020-12-17 21:55:37 | Python was not found; run without arguments to... | `<p>I was trying to download a GUI, but the ter...` | [python, python-3.x, windows-10] | 223 |
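
The titles in this output are HTML-escaped: row 2 contains `Can&#39;t` rather than `Can't`. If plain text is needed, one option is the standard library's `html.unescape`. The sketch below applies it to a small illustrative Series, `titles`, built from two of the truncated titles displayed above; applying it to the `Titre` column is a suggestion, not a step the notebook performs.

```python
import html

import pandas as pd

# Two titles copied from the (truncated) output above, for illustration only.
titles = pd.Series([
    "sqlalchemy.exc.NoSuchModuleError: Can&#39;t lo...",
    "What is pyproject.toml file for?",
])

# Decode HTML character references such as &#39; into plain characters.
clean_titles = titles.map(html.unescape)

print(clean_titles.tolist())
```

Titles without any escaped characters pass through unchanged.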


In [6]:
data.shape

(50, 5)

We sort the questions by date, from most recent to oldest:

In [7]:
result = data.sort_values(by='Date', ascending=False)
result.head()

| | Date | Titre | Question | Tags | Score |
|---|---|---|---|---|---|
| 4 | 2020-12-17 21:55:37 | Python was not found; run without arguments to... | `<p>I was trying to download a GUI, but the ter...` | [python, python-3.x, windows-10] | 223 |
| 14 | 2020-12-15 00:05:16 | What does this tensorflow message mean? Any si... | `<p>I just installed tensorflow v2.3 on anacond...` | [python, tensorflow, anaconda] | 149 |
| 1 | 2020-12-11 15:53:36 | xlrd.biffh.XLRDError: Excel xlsx file; not sup... | `<p>I am trying to read a macro-enabled Excel w...` | [python, pandas, xlrd, pcf] | 293 |
| 10 | 2020-12-11 11:08:36 | Pandas cannot open an Excel (.xlsx) file | `<p>Please see my code below:</p>\n<pre><code>i...` | [python, excel, pandas] | 171 |
| 40 | 2020-12-09 02:50:21 | Pydantic enum field does not get converted to ... | `<p>I am trying to restrict one field in a clas...` | [python, serialization, fastapi, pydantic] | 88 |
