Making the free-text responses easier to read

- For each question, export the responses to JSON, then use Typst's data-loading functions to convert them to PDF

```console
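#line(length: 100%)

Response (English)
Translation (Japanese)

Respondent attributes (age group, gender, region, job title)
Sentiment analysis of the response (polarity, subjectivity)

... repeated for each response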
```
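To make the repeating structure concrete, the sketch below formats a single record as plain text in Python. It is only an illustration of the layout, not the Typst template itself: `render_record`, `ATTRIBUTE_KEYS`, and the placeholder values are assumptions, while the key names follow the columns exported later in this notebook.

```python
from typing import Any, Mapping

# Attribute columns shown in the layout (age, gender, region, job title).
ATTRIBUTE_KEYS = ("q1", "q2", "q3", "q5")


def render_record(record: Mapping[str, Any], header: str) -> str:
    """Format one exported record as a plain-text block (hypothetical helper)."""
    attributes = ", ".join(str(record[key]) for key in ATTRIBUTE_KEYS)
    polarity = record[f"{header}_polarity"]
    subjectivity = record[f"{header}_subjectivity"]
    return "\n".join(
        [
            "-" * 40,  # stands in for the horizontal rule
            "",
            str(record[header]),  # response (English)
            str(record[f"{header}_ja"]),  # translation (Japanese)
            "",
            f"Attributes: {attributes}",
            f"Sentiment: polarity={polarity}, subjectivity={subjectivity}",
        ]
    )


# Placeholder values, not real survey data.
sample = {
    "q1": "<age>",
    "q2": "<gender>",
    "q3": "<region>",
    "q5": "<job title>",
    "q15": "<answer in English>",
    "q15_ja": "<Japanese translation>",
    "q15_polarity": 0.5,
    "q15_subjectivity": 0.25,
}

print(render_record(sample, "q15"))
```

Calling the helper once per record and joining the results gives a quick text preview of what the PDF pages should contain.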

Free-text questions

- ``q15``: 【Q15】Please let us know If your group has any good practice examples related to DE&I ?
- ``q16``: 【Q16】Please let us know if there is anything your group needs to work on or if your group has any problems related to DE&I.
- ``q18``: 【Q18】Could you tell us more about your thoughts (agree / disagree) ?
- ``q20``: 【Q20】Do you have any concerns / problems related to DE&I initiatives in science ?
- ``q21``: 【Q21】What reasons do you think are hindering DE&I initiatives in science ?
- ``q22``: 【Q22】Comments

Attributes

- ``q1``: 【Q1】What is your age ?
- ``q2``: 【Q2】What gender do you identify as ?
- ``q3``: 【Q3】Which geographical region are you currently working or attending school/university in ?
- ``q5``: 【Q5】What is your job title ?
- ``q6``: 【Q6】Which group do you belong to ? 
- ``q7``: 【Q7】What is your research type ?

In [1]:
from pathlib import Path
import pandas as pd
import altair as alt
import titanite as ti

print(f"Pandas : {pd.__version__}")
print(f"Altair : {alt.__version__}")
print(f"Titanite : {ti.__version__}")

Pandas : 2.0.3
Altair : 5.0.1
Titanite : 0.1.2


In [2]:
f_cfg = "../sandbox/config.toml"
f_csv = "../sandbox/tmp_preprocessed.csv"

cfg = ti.Config(fname=f_cfg)
cfg.load()
# config.questions
category = cfg.categories()

data = pd.read_csv(f_csv, parse_dates=["timestamp"])
data = ti.categorical_data(data, category)

In [6]:
q15 = data.dropna(subset="q15")
len(q15)

51

In [7]:
q16 = data.dropna(subset="q16")
len(q16)

51

In [8]:
q18 = data.dropna(subset="q18")
len(q18)

62

In [23]:
attributes = ["q1", "q2", "q3", "q5", "q6", "q7"]
headers = ["q15", "q16", "q18", "q20", "q21", "q22"]

for header in headers:
    # Keep only the respondents who answered this free-text question
    q = data.dropna(subset=header)
    # print(f"{header}: {len(q)}")
    # Columns to export: attributes, the answer, its Japanese translation, and the sentiment scores
    h = attributes + [header, f"{header}_ja", f"{header}_polarity", f"{header}_subjectivity"]
    # fname = f"../data/typst/comment_{header}.csv"
    # q[h].to_csv(fname, index=False)
    # Write one JSON array of records per question, sorted by the answer text
    fname = f"../data/typst/comment_{header}.json"
    q[h].sort_values(by=header).to_json(fname, orient="records")
    print(f"Saved as {fname}")

Saved as ../data/typst/comment_q15.json
Saved as ../data/typst/comment_q16.json
Saved as ../data/typst/comment_q18.json
Saved as ../data/typst/comment_q20.json
Saved as ../data/typst/comment_q21.json
Saved as ../data/typst/comment_q22.json
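
With `orient="records"`, each file should hold a single JSON array with one object per response. The sketch below is one way to sanity-check that shape before handing a file to Typst, using only the standard library. `missing_keys` is a hypothetical helper, and `sample_text` is a placeholder standing in for the contents of `comment_q15.json`.

```python
import json

ATTRIBUTES = ["q1", "q2", "q3", "q5", "q6", "q7"]


def missing_keys(records, header):
    """Return (index, missing key names) for records that lack expected keys."""
    expected = set(ATTRIBUTES) | {
        header,
        f"{header}_ja",
        f"{header}_polarity",
        f"{header}_subjectivity",
    }
    problems = []
    for index, record in enumerate(records):
        missing = expected - set(record)
        if missing:
            problems.append((index, sorted(missing)))
    return problems


# Placeholder record, not real survey data.
sample_text = json.dumps(
    [
        {
            "q1": "<age>",
            "q2": "<gender>",
            "q3": "<region>",
            "q5": "<job title>",
            "q6": "<group>",
            "q7": "<research type>",
            "q15": "<answer in English>",
            "q15_ja": "<Japanese translation>",
            "q15_polarity": 0.5,
            "q15_subjectivity": 0.25,
        }
    ]
)

records = json.loads(sample_text)
print(type(records).__name__, len(records))
print(missing_keys(records, "q15"))
```

To check a real export, replace `sample_text` with the text read from the corresponding file under `../data/typst/`.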
