# Getting feedback on the entire JASMIN docs stack from GPT-5

Redo as follows:
- [ ] Parse latest docs
- [ ] Structure them as input
- [ ] Improve the prompt
- [ ] Ask the prompt to provide Markdown content, and to separate out:
    - [ ] Summary
    - [ ] Structural recommendations
    - [ ] Typos and formatting issues
    - [ ] Factual inconsistencies
- [ ] Ask it to point to the relevant pages and sections where changes are recommended
- [ ] Ask it to tabulate its recommendations
- [ ] Ask it to make suggestions for rewording
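
The checklist above could translate into a prompt along the following lines. This is only a sketch: `REVIEW_PROMPT`, `REQUIRED_HEADINGS`, `build_prompt`, and `missing_headings` are hypothetical names, and the wording is illustrative rather than the prompt used in the cells below.

```python
# Hypothetical revised prompt covering the checklist items; wording is illustrative.
REVIEW_PROMPT = """
You will receive a large body of documentation, split into <SECTION> blocks,
each with a <TITLE> and a <CONTENT> element.

Respond in Markdown using exactly these top-level headings:

## Summary
## Structural recommendations
## Typos and formatting issues
## Factual inconsistencies

For every recommendation, name the page title and section it applies to.
Present the recommendations as a table with the columns:
Page | Section | Issue | Suggested rewording

<DOCUMENTATION_START>

{data}

<DOCUMENTATION_END>
"""

REQUIRED_HEADINGS = [
    "## Summary",
    "## Structural recommendations",
    "## Typos and formatting issues",
    "## Factual inconsistencies",
]


def build_prompt(data):
    """Substitute the documentation text into the prompt template."""
    return REVIEW_PROMPT.format(data=data)


def missing_headings(markdown):
    """Return the required headings that do not appear as lines in a response."""
    lines = {line.strip() for line in markdown.splitlines()}
    return [heading for heading in REQUIRED_HEADINGS if heading not in lines]
```

Checking a response with `missing_headings` gives a quick signal of whether the model followed the requested structure before anyone reads the full review.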

In [57]:
# Import the OpenAI client and supporting modules, then build a prompt expanded from:
# 1. I'm going to send you a large document
# 2. Create an internal summary of all key information
# 3. Identify any inconsistencies or errors in the document
# 4. Suggest corrections or improvements to enhance clarity and coherence
# The prompt is sent to the OpenAI API along with the document (one large string)

from pathlib import Path
from openai import OpenAI
import pandas as pd
from dotenv import load_dotenv
load_dotenv(Path("../../../../../../msc-project/work/.env"))

client = OpenAI()

# Reformat data into a prompt for the OpenAI API
prompt_template = """
1. I'm going to send you a large document
2. Create an internal summary of all key information
3. Identify any inconsistencies or errors in the document
4. Suggest corrections or improvements to enhance clarity and coherence

Here is the documentation I would like you to check:

<DOCUMENTATION_START>

{data}

<DOCUMENTATION_END>
"""

def send(data, model="gpt-5-mini"):
    """Send the prompt, with the documentation substituted in, to the OpenAI API."""
    response = client.responses.create(
        model=model,
        input=prompt_template.format(data=data),
    )
    return response


In [41]:
data_path = Path("../../../../../../../ai-mini-projects/jasmin-docs-chatbot/jasmin_docs.csv")
data = pd.read_csv(data_path)

In [42]:
data = data[data.contents.notnull()] # Remove NaNs

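content = []

# Wrap each page in <SECTION>/<TITLE>/<CONTENT> tags so the model can tell pages apart
for _, row in data.iterrows():
    content.append("<SECTION>\n")
    content.append(f"<TITLE>\n{row.title}\n</TITLE>\n")
    content.append(f"<CONTENT>\n{row.contents}\n</CONTENT>\n")
    content.append("</SECTION>\n\n")

# Join the pieces into one string to substitute into the prompt
content = "\n".join(content)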

In [43]:
len(content)

741939

In [44]:
resp = send(content[:100000], model="gpt-5")
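
Slicing at a fixed character count can cut a section in half. A minimal sketch of an alternative follows, assuming each page is first assembled into a single `<SECTION>...</SECTION>` string and these strings are kept in a list. `batch_sections` and `max_chars` are hypothetical names, and the character limit is only a stand-in for a real token budget.

```python
def batch_sections(sections, max_chars=100_000):
    """Group whole section strings into batches of at most max_chars characters.

    A single section longer than max_chars is placed in a batch of its own
    rather than being cut.
    """
    batches, current, size = [], [], 0
    for section in sections:
        # Start a new batch if adding this section would exceed the limit
        if current and size + len(section) > max_chars:
            batches.append("".join(current))
            current, size = [], 0
        current.append(section)
        size += len(section)
    if current:
        batches.append("".join(current))
    return batches
```

Each batch could then be passed to `send` in turn, so that every request contains only complete sections.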

In [51]:
print(resp.output_text)

Below is an internal, actionable review of your documentation. It includes: (1) a concise summary of key information, (2) a structured list of inconsistencies and errors, and (3) concrete suggestions/corrections to improve clarity and coherence.

1) Internal summary of key information

About the site and support
- Docs use richer formatting; navigation via top nav, breadcrumbs, sidebar, ToC, tags, and inter-article links.
- Report doc site problems via the GitHub issues link; each page shows last updated date and commit details.
- For other issues/questions, use the JASMIN Help beacon “Ask” contact form (bottom-right).

Account types
- Standard account: one human user only; manages SSH key/password/roles via Accounts Portal; key must be unique and traceable; can be a responsible user for shared/service accounts; training accounts are short-term standard accounts preset for events.
- Shared account: used by a small, defined set of responsible users (each uses own SSH key); shared accoun

In [54]:
resp_100_000 = resp
len(resp_100_000.output_text)

17729

In [58]:
resp_741_939 = send(content)

In [59]:
output = resp_741_939.output_text
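
To keep a review for later comparison, the text could be written to a Markdown file. This is a sketch only; `save_markdown` and the example path are hypothetical and not part of the original notebook.

```python
from pathlib import Path


def save_markdown(text, path):
    """Write text to path as UTF-8, creating parent directories; return the Path."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return path
```

For example, `save_markdown(output, "reviews/full_docs_review.md")` would store the full-document review next to the notebook.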

In [60]:
print(output)

Summary — key information (internal)
- Purpose/Scope
  - This is the JASMIN user documentation covering: accounts, access, storage, transfers, interactive & batch compute, services (Group Workspaces, Object Store, XFC, Elastic Tape (ET), JDMA, JDMA deprecation → NLDS, Dask Gateway, Pangeo, Notebooks, Globus, GridFTP, rclone, bbcp, rsync/scp/sftp, MobaXterm, NoMachine NX, VSCode remote), Cluster-as-a-Service (Kubernetes, Slurm, NFS), and admin/manager workflows (consortium/GWS managers, project portals).
  - Mixes user-level how-to, service descriptions, admin procedures, and system operation notes. Many legacy / deprecated items are present.

- Accounts & authentication
  - Account types: STANDARD (individual; training accounts short-term), SHARED (multiple responsible users with their own SSH keys; shared password allowed but must be shared securely), SERVICE (for daemons/functions; no SSH key; cannot reset password).
  - JASMIN Accounts Portal is central (create account, upload publi

In [61]:
print(dir(resp_741_939))

['__abstractmethods__', '__annotations__', '__class__', '__class_getitem__', '__class_vars__', '__copy__', '__deepcopy__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__fields__', '__fields_set__', '__format__', '__ge__', '__get_pydantic_core_schema__', '__get_pydantic_json_schema__', '__getattr__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__pretty__', '__private_attributes__', '__pydantic_complete__', '__pydantic_computed_fields__', '__pydantic_core_schema__', '__pydantic_custom_init__', '__pydantic_decorators__', '__pydantic_extra__', '__pydantic_fields__', '__pydantic_fields_set__', '__pydantic_generic_metadata__', '__pydantic_init_subclass__', '__pydantic_parent_namespace__', '__pydantic_post_init__', '__pydantic_private__', '__pydantic_root_model__', '__pydantic_serializer__', '__pydantic_validator__', '__reduce__', '__reduce_ex__', '__replace__', '
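
Most of the names listed by `dir()` are dunder or private attributes. A small helper can filter them out so that only the public fields and methods remain. `public_attributes` is a hypothetical name used for illustration.

```python
def public_attributes(obj):
    """Return the sorted attribute names of obj that do not start with an underscore."""
    return sorted(name for name in dir(obj) if not name.startswith("_"))
```

Calling `public_attributes(resp_741_939)` would then show only the fields worth inspecting, such as `output_text`.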