# Initial requirements

To test the API, the task is to:
- fetch 50 questions
- created within a defined period
- tagged "python"
- with a score > 50
- in line with GDPR principles, retrieving only the following fields:
    - date
    - title
    - tags
    - score
- load them into a DataFrame
- display them

The Python wrapper StackAPI is used for this.

# GDPR

No information that could identify individuals (the question authors) is retrieved here: the API filter restricts the response to the date, title, tags and score of each question.
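As an extra safeguard on top of the API filter, the same restriction can be enforced client-side with a whitelist. The following is a minimal sketch, not part of the original notebook: `ALLOWED_FIELDS`, `keep_allowed_fields` and the sample item (including its `owner` key) are hypothetical names and data chosen for illustration.

```python
# Hypothetical whitelist matching the four fields named in the requirements.
ALLOWED_FIELDS = ("creation_date", "title", "tags", "score")


def keep_allowed_fields(items, allowed=ALLOWED_FIELDS):
    """Return copies of the items restricted to the whitelisted keys."""
    return [{key: item[key] for key in allowed if key in item} for item in items]


# Made-up item shaped like an API response that still carries author data.
sample_items = [
    {
        "creation_date": 1609459200,
        "title": "Example question",
        "tags": ["python"],
        "score": 60,
        "owner": {"display_name": "someone"},
    }
]

cleaned = keep_allowed_fields(sample_items)
print(cleaned)
```

Applied to `questions["items"]` before building the DataFrame, this would drop any field that slipped past the filter.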

# Sources

[StackAPI documentation](https://stackapi.readthedocs.io/en/latest/user/advanced.html#calling-fetch-with-various-api-parameters)

[Creating a filter (questions)](https://api.stackexchange.com/docs/questions#order=desc&sort=activity&filter=default&site=stackoverflow&run=true): edit the "Try it" form at the bottom of the page

# API request

In [1]:
from stackapi import StackAPI
from datetime import datetime as dt, timedelta as td


SITE = StackAPI("stackoverflow")
# limit the number of results to 50 per page and only get the first page
# 👍 hits the API only once
SITE.page_size = 50
SITE.max_pages = 1

questions = SITE.fetch(
    "questions",
    # adds a filter to the request to only get the fields we need
    filter="!*1PUVE3_.Beefvtn(y7-EH8.RmJ73FaI-xo97o5gO",
    # only get questions from the last 3 years
    fromdate=dt.now() - td(days=3*365),
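    # with a score of at least 50: `min` applies to the sort field (votes here)
    # and is inclusive, so use min=51 for a strict "score > 50"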
    min=50,
    # with the python tag
    tagged="python",
    # sorted by votes
    sort="votes",
)

print(f"✅ {len(questions['items'])} questions found")

✅ 50 questions found
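The request above bounds the period only from below, with `fromdate`. If a closed period is wanted, the Stack Exchange API also accepts a `todate` parameter, both expressed as Unix timestamps in seconds. The sketch below only builds the two parameters and does not call the API. The dates are arbitrary examples, `to_epoch` and `period_params` are hypothetical names, and it assumes `SITE.fetch` forwards `todate` to the API the same way it forwards `fromdate`.

```python
from datetime import datetime, timezone


def to_epoch(moment):
    """Convert a timezone-aware datetime to a Unix timestamp in whole seconds."""
    return int(moment.timestamp())


# Arbitrary example period, expressed in UTC to avoid local-time ambiguity.
period_start = datetime(2021, 1, 1, tzinfo=timezone.utc)
period_end = datetime(2024, 1, 1, tzinfo=timezone.utc)

period_params = {
    "fromdate": to_epoch(period_start),
    "todate": to_epoch(period_end),
}
print(period_params)
```

The dictionary could then be unpacked into the call, for example `SITE.fetch("questions", tagged="python", sort="votes", min=50, **period_params)`.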


# Conversion to a DataFrame

In [2]:
import pandas as pd


# store results in a pandas dataframe
df = pd.DataFrame(questions["items"])
# reorder dataframe columns in creation_date, title, tags, score
df = df[["creation_date", "title", "tags", "score"]]
# convert creation_date to a readable date
df["creation_date"] = pd.to_datetime(df["creation_date"], unit="s")

# Displaying the questions

In [3]:
display(df)

| | creation_date | title | tags | score |
|---|---|---|---|---|
| 0 | 2023-03-01 19:52:19 | How do I solve &quot;error: externally-managed... | [python, error-handling, pip] | 244 |
| 1 | 2022-03-23 18:02:02 | How can I fix the &quot;zsh: command not found... | [python, macos, terminal, atom-editor, macos-m... | 209 |
| 2 | 2023-04-07 07:05:59 | Error &quot;&#39;DataFrame&#39; object has no ... | [python, pandas, dataframe, attributeerror] | 203 |
| 3 | 2022-05-31 02:47:35 | TypeError: Descriptors cannot not be created d... | [python, tensorflow, ray] | 203 |
| 4 | 2022-03-30 07:46:15 | ImportError: cannot import name &#39;_unicodef... | [python, python-black] | 178 |
| 5 | 2021-12-11 22:44:15 | ImportError: cannot import name &#39;url&#39; ... | [python, django, django-urls, django-4.0] | 177 |
| 6 | 2021-04-06 15:23:41 | Auto-create primary key used when not defining... | [python, python-3.x, django] | 172 |
| 7 | 2021-06-27 15:36:29 | Understand Python swapping: why is a, b = b, a... | [python, list, indexing, swap] | 172 |
| 8 | 2022-01-25 15:09:43 | Does it make sense to use Conda + Poetry? | [python, machine-learning, package, conda, pyt... | 162 |
| 9 | 2022-01-24 16:46:44 | Good alternative to Pandas .append() method, n... | [python, pandas, dataframe, data-wrangling, da... | 156 |
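Several titles in the output contain HTML entities such as `&quot;` and `&#39;`, because the raw titles arrive HTML-escaped. A minimal sketch of decoding them with the standard library's `html.unescape` follows; the two sample strings are copied from the truncated output above, and `raw_titles` and `decoded_titles` are hypothetical names.

```python
import html

# Two truncated titles taken verbatim from the displayed output.
raw_titles = [
    "How do I solve &quot;error: externally-managed...",
    "Error &quot;&#39;DataFrame&#39; object has no ...",
]

# html.unescape turns named and numeric character references into characters.
decoded_titles = [html.unescape(title) for title in raw_titles]

for title in decoded_titles:
    print(title)
```

On the DataFrame itself, the same decoding could be applied with `df["title"] = df["title"].map(html.unescape)` before calling `display(df)`.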
