19 changes: 9 additions & 10 deletions docs/README.md
# Typeagent Docs

## Basics

- [Getting Started](getting-started.md)
- [High-level API](high-level-api.md)
- [Environment Variables](env-vars.md)

## Advanced

- [conversation.query](query-method.md)

### Other

- [Architecture Design](typeagent-architecture.md)
- [Reproducing the Demos](demos.md)
- [Downloading GMail Messages](gmail.md)
- [Developing and Contributing](developing.md)
3 changes: 3 additions & 0 deletions docs/demos.md
# How to Reproduce the Demos

This will be revealed after [PyBay 2025](https://pybay.org/).
12 changes: 12 additions & 0 deletions docs/developing.md
# Developing and Contributing

**Always follow the [Code of Conduct](../CODE_OF_CONDUCT.md).**

To contribute, submit issues or PRs to
[our repo](https://github.com/microsoft/typeagent-py).

To develop, for now you're on your own.
We use [uv](https://docs.astral.sh/uv/) for some things.
Check out the [Makefile](../Makefile) for some recipes.

More TBD.
44 changes: 44 additions & 0 deletions docs/env-vars.md
# Environment Variables

Virtually no LLM-using application works without API tokens or other
authentication secrets, which are almost always passed via environment
variables.

Typeagent currently supports two families of environment variables:

- Those for (public) OpenAI servers.
- Those for the Azure OpenAI service.

## OPENAI environment variables

The (public) OpenAI environment variables include:

- `OPENAI_API_KEY`: Your secret API key that you get from the
[OpenAI dashboard](https://platform.openai.com/api-keys).
- `OPENAI_MODEL`: An environment variable introduced by
[TypeChat](https://microsoft.github.io/TypeChat/docs/examples/)
indicating the model to use (e.g. `gpt-4o`).

## Azure OpenAI environment variables

If you are using the OpenAI service hosted by Azure, you need different
environment variables, starting with:

- `AZURE_OPENAI_API_KEY`: Your Azure OpenAI API key.
- `AZURE_OPENAI_ENDPOINT`: The full URL of the Azure OpenAI REST API
  (e.g. `https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_DEPLOYMENT_NAME/chat/completions?api-version=2023-05-15`).
- If you use Azure OpenAI, you will know where to get these
  (or ask your sysadmin).

## Conflicts

If you set both `OPENAI_API_KEY` and `AZURE_OPENAI_API_KEY`,
the plain `OPENAI` variables take precedence.

## Other ways to specify environment variables

It is recommended to put your environment variables in a file named
`.env` in the current or parent directory.
To pick up these variables, call `typeagent.aitools.utils.load_dotenv()`
at the start of your program (before calling any typeagent functions).
(For simplicity this is not shown in [Getting Started](getting-started.md).)
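For example, a minimal `.env` file might look like this (the values are
placeholders):

```txt
OPENAI_API_KEY=your-very-secret-openai-api-key
OPENAI_MODEL=gpt-4o
```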
136 changes: 136 additions & 0 deletions docs/getting-started.md
# Getting Started

## Installation

```sh
$ pip install typeagent
```

You might also want to use a
[virtual environment](https://docs.python.org/3/library/venv.html)
or another tool like [poetry](https://python-poetry.org/)
or [uv](https://docs.astral.sh/uv/), as long as your tool can
install wheels from [PyPI](https://pypi.org).

## "Hello world" ingestion program

### 1. Create a text file named `transcript.txt`

```txt
STEVE We should really make a Python library for Structured RAG.
UMESH Who would be a good person to do the Python library?
GUIDO I volunteer to do the Python library. Give me a few months.
```

### 2. Create a Python file named `demo.py`

```py
from typeagent import create_conversation
from typeagent.transcripts.transcript import (
    TranscriptMessage,
    TranscriptMessageMeta,
)


def read_messages(filename) -> list[TranscriptMessage]:
    messages: list[TranscriptMessage] = []
    with open(filename, "r") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # Skip blank lines
            # Parse each line into a TranscriptMessage
            speaker, text_chunk = line.split(None, 1)
            message = TranscriptMessage(
                text_chunks=[text_chunk],
                metadata=TranscriptMessageMeta(speaker=speaker),
            )
            messages.append(message)
    return messages


async def main():
    conversation = await create_conversation("demo.db", TranscriptMessage)
    messages = read_messages("transcript.txt")
    print(f"Indexing {len(messages)} messages...")
    results = await conversation.add_messages_with_indexing(messages)
    print(f"Indexed {results.messages_added} messages.")
    print(f"Got {results.semrefs_added} semantic refs.")


if __name__ == "__main__":
    import asyncio
    asyncio.run(main())
```
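The parsing in `read_messages()` hinges on `str.split(None, 1)`, which
splits on the first run of whitespace; a quick stdlib-only illustration:

```python
line = "STEVE We should really make a Python library for Structured RAG."
# Split on the first whitespace run: speaker first, rest of the line second.
speaker, text_chunk = line.split(None, 1)
print(speaker)     # STEVE
print(text_chunk)  # We should really make a Python library for Structured RAG.
```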

### 3. Set up your environment for using OpenAI

The minimal set of environment variables is:

```sh
export OPENAI_API_KEY=your-very-secret-openai-api-key
export OPENAI_MODEL=gpt-4o
```

Some OpenAI setups require additional environment variables.
See [Environment Variables](env-vars.md) for more information.
You will also find information there on how to use
Azure-hosted OpenAI models.

### 4. Run your program

```sh
$ python demo.py
```

Expected output looks like:

```txt
0.027s -- Using OpenAI
Indexing 3 messages...
Indexed 3 messages.
Got 26 semantic refs.
```

## "Hello world" query program

### 1. Write this small program

```py
from typeagent import create_conversation
from typeagent.transcripts.transcript import TranscriptMessage


async def main():
    conversation = await create_conversation("demo.db", TranscriptMessage)
    question = "Who volunteered to do the python library?"
    print("Q:", question)
    answer = await conversation.query(question)
    print("A:", answer)


if __name__ == "__main__":
    import asyncio
    asyncio.run(main())
```

### 2. Set up your environment as above

### 3. Run your program

```sh
$ python query.py
```

Expected output looks like:

```txt
0.019s -- Using OpenAI
Q: Who volunteered to do the python library?
A: Guido volunteered to do the Python library.
```

## Next steps

You can study the full documentation for `create_conversation()`
and `conversation.query()` in [High-level API](high-level-api.md).

You can also study the source code at the
[typeagent-py repo](https://github.com/microsoft/typeagent-py).
8 changes: 8 additions & 0 deletions docs/gmail.md
# Extracting GMail Messages

There's a helper script in the repo under `gmail/`.
It requires creating and configuring a Google API project.
Until we have time to write this up, your best bet is to
ask your favorite search engine or LLM-based chat bot for help.

More TBD.
111 changes: 111 additions & 0 deletions docs/high-level-api.md
# High-level API

NOTE: When an argument's default is given as `[]`, this is shorthand
for a fresh empty list assigned on each call. It is not the literal
Python meaning of this notation, which would make all calls share a
single empty list object as their default.

## Classes

### Message classes

#### `ConversationMessage`

`typeagent.knowpro.universal_message.ConversationMessage`

Constructor and fields:

```py
class ConversationMessage(
    text_chunks: list[str],             # Text of the message, 1 or more chunks
    tags: list[str] = [],               # Optional tags
    timestamp: str | None = None,       # ISO timestamp in UTC with 'z' suffix
    metadata: ConversationMessageMeta,  # See below
)
```

- Only `text_chunks` is required.
- Tags are arbitrary pieces of information attached to a message;
  they will be indexed (e.g. `["sketch", "pet shop"]`).
- If present, the timestamp must be of the form `2025-10-14T09:03:21z`.

#### `ConversationMessageMeta`

`typeagent.knowpro.universal_message.ConversationMessageMeta`

Constructor and fields:

```py
class ConversationMessageMeta(
    speaker: str | None = None,  # Optional entity who sent the message
    recipients: list[str] = [],  # Optional entities to whom the message was sent
)
```

This class represents the metadata for a given `ConversationMessage`.

#### `TranscriptMessage` and `TranscriptMessageMeta`

`typeagent.transcripts.transcript.TranscriptMessage`
`typeagent.transcripts.transcript.TranscriptMessageMeta`

These are simple aliases for `ConversationMessage` and
`ConversationMessageMeta`, respectively.

### Conversation classes

#### `ConversationBase`

`typeagent.knowpro.factory.ConversationBase`

Represents a conversation, which holds ingested messages and the
extracted and indexed knowledge thereof.

It is constructed by calling the factory function
`typeagent.create_conversation` described below.

It has one public method:

- `query`
```py
async def query(
    question: str,
    # Other parameters are not public
) -> str
```

Tries to answer the question using (only) the indexed messages.
If no answer is found, the returned string starts with
`"No answer found:"`.

## Functions

There is currently only one public function.

### Factory function

- `create_conversation`
```py
async def create_conversation(
    dbname: str | None,
    message_type: type,
    name: str = "",
    tags: list[str] | None = None,
    settings: ConversationSettings | None = None,
) -> ConversationBase
```

- Constructs a conversation object.
- The required `dbname` argument specifies the SQLite3 database
name (e.g. `test.db`). If explicitly set to `None` the data is
stored in RAM and will not persist when the process exits.
- The required `message_type` is normally `TranscriptMessage`
or `ConversationMessage` (there are other possibilities too,
as yet left undocumented).
- The optional `name` specifies the conversation name, which
may be used in diagnostics.
- `tags` gives tags (like `ConversationMessage.tags`) for the whole
conversation.
- `settings` provides overrides for various aspects of the knowledge
extraction and indexing process. Its exact usage is currently left
as an exercise for the reader.