Assignment 6. Project analysis (gql_ug + gql_projects)
- Create a GQL query based on the existing federation
- Define the transformation GQL response -> table rows (input for the pivot table)
- Create a pivot table
- Create a pie / bar chart
- Create a Sunburst / Chord chart

Deliver the result as an ipynb notebook (authentication by username and password, implementation with aiohttp, response transformation, table creation, chart creation).

Installing the required libraries

In [2]:
%pip install pandas aiohttp plotly nbformat

Note: you may need to restart the kernel to use updated packages.

In [2]:
import aiohttp
import asyncio
import json
import plotly.express as px
import pandas as pd
from itertools import product
from functools import reduce

Function for obtaining the token


In [3]:
async def getToken(username, password):
    # Login endpoint of the local federation
    keyurl = "http://localhost:33001/oauth/login3"
    async with aiohttp.ClientSession() as session:
        # Step 1: GET the login key
        async with session.get(keyurl) as resp:
            keyJson = await resp.json()

        # Step 2: POST the key together with the credentials
        payload = {"key": keyJson["key"], "username": username, "password": password}
        async with session.post(keyurl, json=payload) as resp:
            tokenJson = await resp.json()
    # None is returned when the response carries no token
    return tokenJson.get("token", None)

Function for defining the GraphQL query

In [4]:
def query(q, token):
    # Bind the query text and token; the returned coroutine takes only the variables
    async def post(variables):
        gqlurl = "http://localhost:33001/api/gql"
        payload = {"query": q, "variables": variables}
        # The token is sent as the "authorization" cookie
        cookies = {'authorization': token}
        async with aiohttp.ClientSession() as session:
            async with session.post(gqlurl, json=payload, cookies=cookies) as resp:
                if resp.status != 200:
                    # On an HTTP error, print and return the raw response text
                    text = await resp.text()
                    print(text)
                    return text
                else:
                    response = await resp.json()
                    return response
    return post
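
The `post` function returned by `query` sends the query text and a `variables` dictionary in a single JSON payload, although this notebook only ever passes an empty dictionary. As a minimal sketch of how that payload could look for a parameterised query: the `$skip` and `$limit` variables and the matching `projectPage` arguments are assumptions made for illustration and may not exist in the federation's schema.

```python
import json

# Hypothetical parameterised query: the skip/limit arguments are assumed,
# not taken from the federation's schema.
paged_query = """
query ($skip: Int, $limit: Int) {
  projectPage(skip: $skip, limit: $limit) {
    id
    name
  }
}
"""

variables = {"skip": 0, "limit": 10}

# Same payload shape that post() builds: query text plus variables
payload = {"query": paged_query, "variables": variables}
body = json.dumps(payload)
print(body)
```

With `query(paged_query, token)` the same `variables` dictionary would be passed to the returned coroutine instead of `{}`.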

Helper functions for processing the data

In [5]:
def enumerateAttrs(attrs):
    # Yield (output column, first segment of the dotted source path)
    for key, value in attrs.items():
        names = value.split(".")
        name = names[0]
        yield key, name

def flattenList(inList, outItem, attrs):
    # Flatten every dict in the list; each one may yield several rows
    for item in inList:
        assert isinstance(item, dict), "in list only dicts are expected"
        for row in flatten(item, outItem, attrs):
            yield row

def flattenDict(inDict, outItem, attrs):
    result = {**outItem}
    complexNames = []
    for key, name in enumerateAttrs(attrs):
        attributeValue = inDict.get(name, None)
        if isinstance(attributeValue, (list, dict)):
            # Nested value: process it once per source name, even when several
            # columns read from it (otherwise list items yield duplicate rows)
            if name not in complexNames:
                complexNames.append(name)
        else:
            result[key] = attributeValue
    lists = []
    for name in complexNames:
        attributeValue = inDict.get(name, None)
        prefix = f"{name}."
        prefixlen = len(prefix)
        # Mappings that point into this nested value, with the prefix stripped
        subAttrs = {k: v[prefixlen:] for k, v in attrs.items() if v.startswith(prefix)}
        items = list(flatten(attributeValue, result, subAttrs))
        lists.append(items)

    if len(lists) == 0:
        yield result
    else:
        # Cartesian product of the nested row sets, merged into single rows
        for element in product(*lists):
            reduced = reduce(lambda a, b: {**a, **b}, element, {})
            yield reduced

def flatten(inData, outItem, attrs):
    if isinstance(inData, dict):
        for item in flattenDict(inData, outItem, attrs):
            yield item
    elif isinstance(inData, list):
        for item in flattenList(inData, outItem, attrs):
            yield item
    else:
        assert False, f"Unexpected type on inData {inData}"
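
The final step of `flattenDict` combines the rows produced by each nested attribute with `itertools.product`, so a record with several list-valued attributes expands into one row per combination. The following is a simplified sketch of that idea only, not the helpers themselves; the record and the column names `milestoneID` and `tag` are made up for illustration.

```python
from functools import reduce
from itertools import product

# Hypothetical record with two list-valued attributes
project = {
    "name": "Demo",
    "milestones": [{"id": "m1"}, {"id": "m2"}],
    "tags": [{"name": "a"}, {"name": "b"}],
}

# One partial row per list element, for each nested attribute
milestone_rows = [{"milestoneID": m["id"]} for m in project["milestones"]]
tag_rows = [{"tag": t["name"]} for t in project["tags"]]

# Every combination of partial rows is merged onto the scalar columns
base = {"projectName": project["name"]}
rows = [
    reduce(lambda a, b: {**a, **b}, combo, base)
    for combo in product(milestone_rows, tag_rows)
]

for row in rows:
    print(row)
```

Two milestones and two tags give four rows. An empty list gives an empty product, so such a record would produce no rows at all.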

Login credentials

In [6]:
username = "john.newbie@world.com"
password = "john.newbie@world.com"

GraphQL query

In [7]:
queryStr = """
{
  projectPage {
    id
    name
    startdate
    enddate
    valid
    projectType {
      id
      name
    }
    milestones {
      id
    }
    group {
      id
      name
    }
  }
}
"""

In [8]:
mappers = {
    "projectID": "id",
    "projectName": "name",
    "projectTypeID": "projectType.id",
    "projectType": "projectType.name",
    "startDate": "startdate",
    "endDate": "enddate",
    "validity": "valid",
    "milestonesCount": "milestonesCount",
    "groupID": "group.id",
    "groupName": "group.name",
}
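
Each entry in `mappers` pairs an output column name with a dotted path into one project record of the response. As a rough sketch of how such a path resolves, restricted to nested dicts (lists are not handled here): `pick` is a hypothetical helper, and the sample record is rebuilt from the values in the notebook's printed output.

```python
def pick(record, path):
    """Follow a dotted path such as "projectType.name" through nested dicts."""
    value = record
    for part in path.split("."):
        if not isinstance(value, dict):
            return None  # the path leads into something that is not a dict
        value = value.get(part)
    return value

# Sample record shaped like one item of projectPage
project = {
    "id": "43dd2ff1-5c17-42a5-ba36-8b30e2a243bb",
    "name": "Nukleární reaktor pro budovy",
    "projectType": {"id": "a825d8e1-2e60-4884-afdb-25642db581d8", "name": "GAČR"},
    "group": {"id": "2d9dcd22-a4a2-11ed-b9df-0242ac120003", "name": "Uni"},
}

# A subset of the mappers dictionary
mapping = {
    "projectName": "name",
    "projectType": "projectType.name",
    "groupName": "group.name",
}

row = {column: pick(project, path) for column, path in mapping.items()}
print(row)
```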

In [9]:
async def fullPipe():
    # Authenticate, run the query, and flatten the response into table rows
    token = await getToken(username, password)
    qfunc = query(queryStr, token)
    response = await qfunc({})

    data = response.get("data", None)
    result = data.get("projectPage", None)

    # Add milestonesCount to each project; a missing or null list counts as 0
    resultMapped = list(map(lambda project: {**project, "milestonesCount": len(project.get("milestones") or [])}, result))
    flatData = flatten(resultMapped, {}, mappers)
    return list(flatData)

async def main():
    # Save the rows to a JSON file that the next cell loads
    flatData = await fullPipe()
    with open('resultNotebook.json', "w", encoding='utf-8') as outputFile:
        json.dump(flatData, outputFile)

await main()

In [10]:
with open("resultNotebook.json", "r", encoding="utf-8") as file:  # for real data, replace resultNotebook.json
    data = json.load(file)

pd.set_option('display.max_columns', None)
pd.set_option('display.expand_frame_repr', False)
df = pd.DataFrame(data)

print(df)

                              projectID                   projectName            startDate              endDate  validity  milestonesCount                         projectTypeID projectType                               groupID groupName
0  43dd2ff1-5c17-42a5-ba36-8b30e2a243bb  Nukleární reaktor pro budovy  2023-01-01T17:27:12  2025-12-31T17:27:12      True                2  a825d8e1-2e60-4884-afdb-25642db581d8        GAČR  2d9dcd22-a4a2-11ed-b9df-0242ac120003       Uni
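
The assignment's pivot table step could be sketched with `pandas.pivot_table` as below. The single row is copied from the printed DataFrame above; counting projects per group and project type is an assumed choice of aggregation, not one prescribed by the assignment.

```python
import pandas as pd

# One row copied from the printed DataFrame above
rows = [{
    "projectID": "43dd2ff1-5c17-42a5-ba36-8b30e2a243bb",
    "projectName": "Nukleární reaktor pro budovy",
    "startDate": "2023-01-01T17:27:12",
    "endDate": "2025-12-31T17:27:12",
    "validity": True,
    "milestonesCount": 2,
    "projectTypeID": "a825d8e1-2e60-4884-afdb-25642db581d8",
    "projectType": "GAČR",
    "groupID": "2d9dcd22-a4a2-11ed-b9df-0242ac120003",
    "groupName": "Uni",
}]
df = pd.DataFrame(rows)

# Number of projects for each (group, project type) pair
pivot = pd.pivot_table(
    df,
    index="groupName",
    columns="projectType",
    values="projectID",
    aggfunc="count",
    fill_value=0,
)
print(pivot)
```

Swapping `values` to `milestonesCount` and `aggfunc` to `"sum"` would total milestones instead of counting projects.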
