<h2>Assignment 6. Project analysis (gql_ug + gql_projects)</h2>

- Create a GQL query against the existing federation.
- Define the transformation from the GQL response to table rows (the input for the pivot table).
- Create a pivot table.
- Create a pie / bar chart.
- Create a Sunburst / Chord chart.
- Deliver the result as an ipynb notebook (authentication with username and password, HTTP calls via aiohttp, response transformation, table creation, chart creation).

<b>Installing the required libraries</b>

In [1]:
%pip install pandas aiohttp plotly nbformat

Note: you may need to restart the kernel to use updated packages.




<b>Imports and initialization</b>

In [2]:
import aiohttp
import asyncio
import json
import plotly.express as px
import pandas as pd
from itertools import product
from functools import reduce

<b>Function for obtaining a token</b>

In [3]:
async def getToken(username, password):
    """Log in with username and password; return the token, or None if absent."""
    keyurl = "http://localhost:33001/oauth/login3"
    async with aiohttp.ClientSession() as session:
        # Step 1: fetch the login key from the server.
        async with session.get(keyurl) as resp:
            keyJson = await resp.json()

        # Step 2: post the key back together with the credentials.
        payload = {"key": keyJson["key"], "username": username, "password": password}
        async with session.post(keyurl, json=payload) as resp:
            tokenJson = await resp.json()
    return tokenJson.get("token", None)

<b>Function for defining a GraphQL query</b>

In [4]:
def query(q, token):
    """Return an async function that posts query `q` with the given variables."""
    async def post(variables):
        gqlurl = "http://localhost:33001/api/gql"
        payload = {"query": q, "variables": variables}
        # The token is sent as the `authorization` cookie.
        cookies = {'authorization': token}
        async with aiohttp.ClientSession() as session:
            async with session.post(gqlurl, json=payload, cookies=cookies) as resp:
                if resp.status != 200:
                    # On failure, print and return the raw response body.
                    text = await resp.text()
                    print(text)
                    return text
                return await resp.json()
    return post

<b>Helper functions for data processing</b>

In [5]:
def enumerateAttrs(attrs):
    for key, value in attrs.items():
        names = value.split(".")
        name = names[0]
        yield key, name

def flattenList(inList, outItem, attrs):
    for item in inList:
        assert isinstance(item, dict), "only dicts are expected in the list"
        for row in flatten(item, outItem, attrs):
            yield row

def flattenDict(inDict, outItem, attrs):
    result = {**outItem}
    complexAttrs = []
    for key, value in enumerateAttrs(attrs):
        if value == "milestones":
            result[key] = len(inDict.get("milestones", []))
        else:
            attributeValue = inDict.get(value, None)
            if isinstance(attributeValue, list):
                complexAttrs.append((key, value))
            elif isinstance(attributeValue, dict):
                complexAttrs.append((key, value))
            else:
                result[key] = attributeValue
    lists = []
    for key, value in complexAttrs:
        attributeValue = inDict.get(value, None)
        prefix = f"{value}."
        prefixlen = len(prefix)
        subAttrs = {key: value[prefixlen:] for key, value in attrs.items() if value.startswith(prefix)}
        items = list(flatten(attributeValue, result, subAttrs))
        lists.append(items)
                     
    if len(lists) == 0:
        yield result
    else:
        for element in product(*lists):
            reduced = reduce(lambda a, b: {**a, **b}, element, {})
            yield reduced

def flatten(inData, outItem, attrs):
    if isinstance(inData, dict):
        for item in flattenDict(inData, outItem, attrs):
            yield item
    elif isinstance(inData, list):
        for item in flattenList(inData, outItem, attrs):
            yield item
    else:
        assert False, f"Unexpected type on inData {inData}"
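
The last step of flattenDict combines the partial rows produced for each nested attribute: it takes the Cartesian product of the per-attribute row lists and merges every combination into one flat row. The following is a minimal standalone sketch of just that step. The helper name `combine_partial_rows` and the sample values are made up for illustration and are not part of the notebook.

```python
from functools import reduce
from itertools import product


def combine_partial_rows(lists):
    """Merge every combination of partial rows into one flat row.

    `lists` holds one list of partial rows (dicts) per nested attribute.
    Hypothetical helper that mirrors only the product/reduce step.
    """
    for combination in product(*lists):
        # Later dicts win on key clashes, as with {**a, **b}.
        yield reduce(lambda a, b: {**a, **b}, combination, {})


# Illustrative data: one project type row and two group rows.
type_rows = [{"projectType": "GAČR"}]
group_rows = [{"groupName": "Uni"}, {"groupName": "Faculty"}]

rows = list(combine_partial_rows([type_rows, group_rows]))
print(rows)
```

A list-valued attribute with n items therefore multiplies the number of output rows by n. An empty list makes the product empty, so a record with an empty nested list would yield no rows at all.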

<b>Login credentials</b>

In [6]:
username = "john.newbie@world.com"
password = "john.newbie@world.com"
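
Hardcoded credentials are convenient for a local demo, but they end up in every copy of the notebook. As a possible alternative, the sketch below reads them from environment variables and falls back to the demo account. The variable names `GQL_USERNAME` and `GQL_PASSWORD` and the helper `load_credentials` are assumptions chosen for this example.

```python
import os

DEMO_ACCOUNT = "john.newbie@world.com"


def load_credentials(env=os.environ):
    """Return (username, password) from the environment, else the demo account.

    GQL_USERNAME and GQL_PASSWORD are hypothetical variable names.
    """
    username = env.get("GQL_USERNAME", DEMO_ACCOUNT)
    password = env.get("GQL_PASSWORD", DEMO_ACCOUNT)
    return username, password


username, password = load_credentials()
```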

<b>GraphQL query</b>

In [7]:
queryStr = """
{
  projectPage {
    id
    name
    startdate
    enddate
    valid
    projectType {
      id
      name
    }
    milestones {
      id
    }
    group {
      id
      name
    }
  }
}
"""

<b>Attribute mapping</b>

In [8]:
mappers = {
    "projectID": "id",
    "projectName": "name",
    "startDate": "startdate",
    "endDate": "enddate",
    "validity": "valid",
    "projectTypeID": "projectType.id",
    "projectType": "projectType.name",
    "milestonesCount": "milestones",
    "groupID": "group.id",
    "groupName": "group.name",
}
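
Each entry maps an output column name to a dotted path into the nested GraphQL result. To show how such a path is read, here is a simplified sketch that resolves paths through nested dicts only. The helper `resolve_path` and the sample record are hypothetical. Unlike the flatten helpers above, it does not expand lists or count milestones.

```python
def resolve_path(record, path):
    """Follow a dotted path through nested dicts; return None if a step is missing."""
    current = record
    for step in path.split("."):
        if not isinstance(current, dict):
            return None
        current = current.get(step)
    return current


# Made-up record shaped like one projectPage item (group is null here).
project = {
    "id": "p1",
    "name": "Demo project",
    "projectType": {"id": "t1", "name": "GAČR"},
    "group": None,
}
mappers_subset = {
    "projectName": "name",
    "projectType": "projectType.name",
    "groupName": "group.name",
}

row = {column: resolve_path(project, path) for column, path in mappers_subset.items()}
print(row)
```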

<b>Asynchronous function for the whole process</b>

In [9]:
async def fullPipe():
    token = await getToken(username, password)
    qfunc = query(queryStr, token)
    response = await qfunc({})

    data = response.get("data", None)
    result = data.get("projectPage", None)

    flatData = flatten(result, {}, mappers)
    return list(flatData)

async def main():
    flatData = await fullPipe()
    with open('resultNotebook.json', "w", encoding='utf-8') as outputFile:
        json.dump(flatData, outputFile)

await main()
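
fullPipe assumes that the response is a dict containing `data.projectPage`. The query helper returns the raw response text when the HTTP status is not 200, and a GraphQL server reports query problems in an `errors` list, so the `.get` calls can fail with an unhelpful AttributeError. The sketch below shows one way to validate the response first. The helper `extractData` is hypothetical and not part of the notebook.

```python
def extractData(response, field):
    """Return response["data"][field], or raise with a readable message."""
    if not isinstance(response, dict):
        # query() returns plain text when the HTTP status is not 200.
        raise RuntimeError(f"Unexpected non-JSON response: {response!r}")
    if response.get("errors"):
        raise RuntimeError(f"GraphQL errors: {response['errors']}")
    data = response.get("data") or {}
    if field not in data:
        raise KeyError(f"Field {field!r} missing in response data")
    return data[field]


# Usage with a made-up successful response.
okResponse = {"data": {"projectPage": [{"id": "p1", "name": "Demo project"}]}}
projects = extractData(okResponse, "projectPage")
print(projects)
```

Inside fullPipe, the two `.get` lines could then be replaced by a single call such as `result = extractData(response, "projectPage")`.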

<h2>Table</h2>

In [10]:
with open("resultFake.json", "r") as file:  # for real data, use resultNotebook.json instead
    data = json.load(file)

pd.set_option('display.max_columns', None)
pd.set_option('display.expand_frame_repr', False)
df = pd.DataFrame(data)

print(df)

                               projectID                     projectName                   startDate                     endDate  validity                         projectTypeID projectType  milestonesCount                               groupID   groupName
0   43dd2ff1-5c17-42a5-ba36-8b30e2a243bb    Nukleární reaktor pro budovy         2023-01-01T17:27:12         2025-12-31T17:27:12     False  a825d8e1-2e60-4884-afdb-25642db581d8        GAČR                5  2d9dcd22-a4a2-11ed-b9df-0242ac120003         Uni
1   22586df2-2cf0-4549-ae27-5df7800a2b0b        Větrná turbína pro města  2023-05-21T08:38:52.038899  2026-05-20T08:38:52.038915     False  a825d8e1-2e60-4884-afdb-25642db581d8        GAČR                4  2d9dcd22-a4a2-11ed-b9df-0242ac120003         Uni
2   d765d9ca-0eb0-4c9b-977b-52b59d18a6a1          Inteligentní osvětlení  2023-11-02T08:38:52.038979  2025-10-02T08:38:52.038982      True   c009fed2-6c5-4668-bb60-949dd1b6b3ed        TAČR                3  3f7dcd22-a4a2-11ed-b9df-0
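
The assignment also asks for a pivot table built from these rows. One possible way to do it with pandas is sketched below: milestone counts summed per group and project type. The sample rows are illustrative stand-ins with the same column names as the flattened output (the group "Faculty" is made up), and the choice of index, columns, and aggregation is an assumption, not the only sensible layout.

```python
import pandas as pd

# Illustrative rows shaped like the flattened output; values are sample data.
rows = [
    {"groupName": "Uni", "projectType": "GAČR", "validity": False, "milestonesCount": 5},
    {"groupName": "Uni", "projectType": "GAČR", "validity": False, "milestonesCount": 4},
    {"groupName": "Faculty", "projectType": "TAČR", "validity": True, "milestonesCount": 3},
]
df_sample = pd.DataFrame(rows)

# Rows: groups, columns: project types, cells: total number of milestones.
pivot = pd.pivot_table(
    df_sample,
    index="groupName",
    columns="projectType",
    values="milestonesCount",
    aggfunc="sum",
    fill_value=0,
)
print(pivot)
```

With the real data, `df_sample` would be replaced by the `df` loaded above. Using `aggfunc="count"` instead would give the number of projects per cell.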

<h2>Creating charts</h2>

<b>Processing the fake data</b>

In [11]:
def createTableRow(project):
    row = {}
    row["projectID"] = project["id"]
    row["projectName"] = project["name"]
    row["startDate"] = project["startdate"]
    row["endDate"] = project["enddate"]
    row["validity"] = project["valid"]

    row["projectTypeID"] = project["projectType"]["id"]
    row["projectType"] = project["projectType"]["name"]

    row["milestonesCount"] = len(project['milestones'])

    row["groupID"] = project["group"]["id"]
    row["groupName"] = project["group"]["name"]
    
    return row


with open('dataFake.json', encoding='utf-8') as inputFile:
    data = json.load(inputFile)


sourceTable = []


for project in data["data"]["projectPage"]:
    row = createTableRow(project)
    sourceTable.append(row)


with open('resultFake.json', "w", encoding='utf-8') as outputFile:
    json.dump(sourceTable, outputFile)

<b>Creating the charts</b>

In [12]:
with open("resultFake.json", "r") as file:  # for real data, use resultNotebook.json instead
    data = json.load(file)

df = pd.DataFrame(data)

fig_sunburst = px.sunburst(
    df,
    path=['groupName', 'projectName', 'milestonesCount'],
    values='milestonesCount', 
    title='Project Distribution by Group, Project Name, and Milestones Count'
)

fig_sunburst.update_layout(
    width=800,  
    height=800 
)

df_bar = df.groupby(['groupName', 'validity']).size().reset_index(name='counts')

fig_bar = px.bar(
    df_bar,
    x='groupName',
    y='counts',
    color='validity',
    barmode='group',
    title='Number of Projects by Group and Validity',
    labels={'groupName': 'Group Name', 'validity': 'Validity', 'counts': 'Number of Projects'}
)

fig_bar.update_layout(
    yaxis=dict(
        tickmode='linear',
        dtick=1
    )
)

fig_sunburst.show()
fig_bar.show()