
Add OKP module #33

Merged

tisnik merged 1 commit into lightspeed-core:main from umago:okp on Aug 19, 2025

Conversation

@umago (Contributor) commented Aug 14, 2025

Description

This patch adds an OKP module with helper functions for processing OKP files.

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Related Tickets & Documents

  • Related Issue #
  • Closes #

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features

    • Add OKP metadata ingestion to extract document titles and reference URLs and allow filtering OKP content by related products/projects.
  • Tests

    • Add unit tests for metadata parsing, relevance filtering, and URL/title extraction (using mocks).
  • Chores

    • Update .gitignore to ignore test result outputs.

@coderabbitai bot commented Aug 14, 2025

Walkthrough

Adds a new OKP metadata processing module to parse TOML frontmatter from .md files, filter files by project and presence of URL/title, provide an OKPMetadataProcessor, add unit tests for these utilities, and update .gitignore to ignore tests/test_results.

Changes

  • Ignore updates (.gitignore): add an ignore rule for tests/test_results.
  • OKP metadata processing (src/lightspeed_rag_content/okp.py): new module that extracts the TOML metadata block from OKP Markdown (+++ fenced), parses it to a dict, checks for title and extra.reference_url, tests relation to projects via portal_product_names, iterates/yields matching files, and provides OKPMetadataProcessor with url_function and get_file_title.
  • OKP tests (tests/test_okp.py): new unit tests covering parse_metadata, metadata_has_url_and_title, is_file_related_to_projects, yield_files_related_to_projects, and the OKPMetadataProcessor methods, using mocks for the filesystem and parsing.

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant FS as Filesystem
  participant OKP as OKP Utils

  Caller->>FS: glob("*.md")
  FS-->>Caller: list of files
  loop for each file
    Caller->>OKP: parse_metadata(file)
    OKP-->>Caller: metadata or raise
    Caller->>OKP: metadata_has_url_and_title(metadata)
    OKP-->>Caller: true/false
    Caller->>OKP: is_file_related_to_projects(metadata, projects)
    OKP-->>Caller: true/false
    alt both true
      Caller-->>Caller: yield file
    else
      Caller-->>Caller: skip file (log warning if parse succeeded)
    end
  end
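Per the loop above, the relevance check described in the walkthrough boils down to a case-insensitive substring match between the given project names and the file's portal_product_names. A minimal sketch, with illustrative metadata:

```python
from typing import Any


def is_file_related_to_projects(metadata: dict[str, Any], projects: list[str]) -> bool:
    """True if any project term appears, case-insensitively, in any portal product name."""
    product_names = metadata.get("extra", {}).get("portal_product_names", [])
    names = [n.lower() for n in product_names]
    terms = [p.lower() for p in projects]
    return any(term in name for term in terms for name in names)


meta = {"extra": {"portal_product_names": ["Red Hat OpenStack Platform"]}}
assert is_file_related_to_projects(meta, ["OpenStack"]) is True
assert is_file_related_to_projects(meta, ["Kubernetes"]) is False
```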
sequenceDiagram
  participant Client
  participant OKP as OKPMetadataProcessor
  participant Parser as parse_metadata

  Client->>OKP: url_function(file_path)
  OKP->>Parser: parse_metadata(file_path)
  Parser-->>OKP: metadata dict
  OKP-->>Client: metadata.extra.reference_url

  Client->>OKP: get_file_title(file_path)
  OKP->>Parser: parse_metadata(file_path)
  Parser-->>OKP: metadata dict
  OKP-->>Client: metadata.title

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~15 minutes

Poem

I nibble through TOML with whiskered delight,
Parsing plus-plus fences by moonlit night.
Hop, skip, and filter—projects aligned—
URLs and titles neatly combined.
Tests burrow deep; results out of sight. 🥕✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 4cc0a11 and 8d29bf9.

📒 Files selected for processing (3)
  • .gitignore (1 hunks)
  • src/lightspeed_rag_content/okp.py (1 hunks)
  • tests/test_okp.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • .gitignore
  • src/lightspeed_rag_content/okp.py
  • tests/test_okp.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: mypy
  • GitHub Check: build-and-push-dev
  • GitHub Check: Pylinter



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (7)
.gitignore (1)

53-53: Consider making the ignored path explicitly a directory

If this is meant to be a directory for test artifacts, prefer a trailing slash to make intent explicit and avoid accidental matches with a file named tests/test_results.

-tests/test_results
+tests/test_results/
src/lightspeed_rag_content/okp.py (5)

15-18: Minor readability: avoid shadowing parameters during normalization

Reassigning projects can be confusing. Use new names for the normalized lists.

-    # Lowercase both lists
-    product_names = [p.lower() for p in product_names]
-    projects = [p.lower() for p in projects]
-
-    # Check if any project is in the product names
-    return any(p in pn for p in projects for pn in product_names)
+    # Lowercase both lists
+    product_names_lc = [p.lower() for p in product_names]
+    projects_lc = [p.lower() for p in projects]
+
+    # Check if any project is in the product names
+    return any(p in pn for p in projects_lc for pn in product_names_lc)

Also applies to: 19-21


23-30: Title/URL presence check looks good

Logic correctly verifies both presence and non-empty title. Consider a future enhancement to be more defensive by using metadata.get("title", "") to avoid potential KeyError if upstream callers misuse the function, but current guard is fine.


45-57: Anchor metadata regex to the front matter block

Anchoring to the beginning of the file/front-matter guards against accidentally matching a +++ sequence in the body. Inline flags keep it compact.

-    # Extract everything between the +++ markers
-    match = re.search(rb"\+{3,}\s*(.*?)\s*\+{3,}", content, re.S)
+    # Extract everything between the +++ front-matter markers (top-of-file)
+    match = re.search(rb"(?ms)^\s*\+{3}\s*\n(.*?)\n\s*\+{3}\s*", content)
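To see what the anchoring buys, the two patterns can be compared on a document that also contains an inline +++ run. The input is hypothetical; the patterns are taken from the diff above.

```python
import re

# Original, unanchored pattern: matches the first +++ ... +++ span anywhere.
UNANCHORED = re.compile(rb"\+{3,}\s*(.*?)\s*\+{3,}", re.S)
# Suggested pattern: each +++ fence must start a line.
ANCHORED = re.compile(rb"(?ms)^\s*\+{3}\s*\n(.*?)\n\s*\+{3}\s*")

doc = b'intro +++ stray run +++\n+++\ntitle = "Doc"\n+++\nbody'

assert UNANCHORED.search(doc).group(1) == b"stray run"      # false positive
assert ANCHORED.search(doc).group(1) == b'title = "Doc"'    # real front matter
```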

45-47: Optional: cache parse results to avoid double reads

OKPMetadataProcessor calls parse_metadata twice per file (URL + title). Light caching avoids duplicate IO without complicating call sites. Suitable if files are immutable during a run.

+from functools import lru_cache
@@
-def parse_metadata(filepath):
+@lru_cache(maxsize=512)
+def parse_metadata(filepath):

Note: If you want to support both str and Path inputs consistently in the cache key, you may normalize filepath to str at the start of parse_metadata.
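The caching idea can be illustrated in isolation with a toy stand-in for the real parser; note that without normalization, str and Path arguments would occupy separate cache slots, which is the point of the remark above.

```python
from functools import lru_cache
from pathlib import Path

calls: list[str] = []


@lru_cache(maxsize=512)
def parse_metadata_cached(filepath: str) -> dict:
    """Stand-in for parse_metadata: records each non-cached invocation."""
    calls.append(filepath)  # simulates the file read + TOML parse
    return {"title": Path(filepath).stem}


parse_metadata_cached("docs/install.md")   # miss: "parses" the file
parse_metadata_cached("docs/install.md")   # hit: served from the cache
assert len(calls) == 1
# Path("docs/install.md") would be a different cache key than the string,
# hence the advice to normalize the argument before the cached call.
```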


62-70: Harden accessors against missing keys

If metadata lacks expected keys, raise a clear ValueError instead of a KeyError. This keeps behavior consistent with parse_metadata’s error style and plays better with callers.

     def url_function(self, file_path: str) -> str:
         """Return the URL for the OKP file."""
-        md = parse_metadata(file_path)
-        return md["extra"]["reference_url"]
+        md = parse_metadata(file_path)
+        try:
+            return md["extra"]["reference_url"]
+        except KeyError as e:
+            raise ValueError(f"reference_url not found in metadata for {file_path}") from e
@@
     def get_file_title(self, file_path: str) -> str:
         """Return the title of the OKP file."""
-        md = parse_metadata(file_path)
-        return md["title"]
+        md = parse_metadata(file_path)
+        try:
+            return md["title"]
+        except KeyError as e:
+            raise ValueError(f"title not found in metadata for {file_path}") from e
tests/test_okp.py (1)

35-42: Fix misleading docstring in negative test

Docstring says the metadata has both URL and title, but the test asserts False cases. Update for clarity.

-    def test_metadata_has_url_and_title_false(self):
-        """Test that the metadata has both URL and title."""
+    def test_metadata_has_url_and_title_false(self):
+        """Test that the metadata_missing_url_or_title returns False."""

Also applies to: 44-51

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between fa72d05 and d9b1560.

📒 Files selected for processing (3)
  • .gitignore (1 hunks)
  • src/lightspeed_rag_content/okp.py (1 hunks)
  • tests/test_okp.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
src/lightspeed_rag_content/okp.py (1)
src/lightspeed_rag_content/metadata_processor.py (1)
  • MetadataProcessor (26-96)
tests/test_okp.py (1)
src/lightspeed_rag_content/okp.py (7)
  • metadata_has_url_and_title (23-29)
  • is_file_related_to_projects (11-20)
  • parse_metadata (45-56)
  • yield_files_related_to_projects (32-42)
  • OKPMetadataProcessor (59-70)
  • url_function (62-65)
  • get_file_title (67-70)
🪛 GitHub Actions: Ruff
src/lightspeed_rag_content/okp.py

[error] 20-20: Ruff: Unnecessary list comprehension (C419) in okp.py. Command 'uv tool run ruff check src' failed.

🔇 Additional comments (2)
src/lightspeed_rag_content/okp.py (1)

5-7: No tomli fallback needed — project requires Python 3.12

pyproject.toml declares requires-python = "==3.12.*" (pyproject.toml:51) and uv.lock shows the same; no tomli dependency was found in the repo scan. Keep the existing import tomllib in src/lightspeed_rag_content/okp.py (lines 5–7) as-is — a tomli fallback is unnecessary unless you lower the project's minimum Python version.

  • src/lightspeed_rag_content/okp.py (lines 5–7): import tomllib
  • pyproject.toml:51 — requires-python = "==3.12.*"
  • uv.lock:3 — requires-python = "==3.12.*"
  • No tomli package found in dependencies/search
tests/test_okp.py (1)

65-87: Good coverage of parse_metadata behavior

The test exercises front-matter parsing and nested extra fields well, including list types. Using bytes for mock_open aligns with the binary read in parse_metadata.

Also applies to: 89-117


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (3)
src/lightspeed_rag_content/okp.py (3)

21-22: Good: any() now uses a generator expression (Ruff C419 satisfied)

Thanks for addressing the earlier C419. No allocation of intermediate list anymore.


45-46: Replace print with logging and catch tomllib/Unicode decode errors

Printing from libraries is discouraged. Also catch tomllib.TOMLDecodeError and UnicodeDecodeError to continue iteration on corrupt/invalid files. This aligns with prior review feedback.

-        except ValueError as e:
-            print(f"Skipping file {filepath}: {e}")
+        except (ValueError, tomllib.TOMLDecodeError, UnicodeDecodeError) as e:
+            LOG.warning("Skipping file %s: %s", filepath, e)

5-12: Fix Ruff I001: sort imports and add a module logger

Ruff flagged the import block ordering. Also, switch to a module logger for observability in downstream comments.

Apply:

-import re
-from pathlib import Path
-from typing import Any, Generator
-import tomllib
-
-from lightspeed_rag_content.metadata_processor import MetadataProcessor
+import logging
+import re
+from pathlib import Path
+from typing import Any, Generator
+
+import tomllib
+
+from lightspeed_rag_content.metadata_processor import MetadataProcessor
+
+LOG = logging.getLogger(__name__)
🧹 Nitpick comments (4)
src/lightspeed_rag_content/okp.py (4)

15-20: Defensive handling for portal_product_names

Some OKP sources may provide a single string or include stray whitespace. Normalize to a list of cleaned strings to avoid surprises.

-    product_names = metadata.get("extra", {}).get("portal_product_names", [])
-
-    # Lowercase both lists
-    product_names = [p.lower() for p in product_names]
-    projects = [p.lower() for p in projects]
+    product_names = metadata.get("extra", {}).get("portal_product_names", [])
+    if isinstance(product_names, str):
+        product_names = [product_names]
+
+    # Lowercase and strip both lists, guard for non-str entries
+    product_names = [p.lower().strip() for p in product_names if isinstance(p, str)]
+    projects = [p.lower().strip() for p in projects if isinstance(p, str)]

55-60: Anchor metadata regex to front-matter and consider pre-compilation

Anchoring reduces the chance of accidentally matching +++ blocks elsewhere in the document (e.g., code samples). Pre-compiling avoids recompilation on each call.

-    # Extract everything between the +++ markers
-    match = re.search(rb"\+{3,}\s*(.*?)\s*\+{3,}", content, re.S)
+    # Extract TOML front-matter between +++ markers at the start of lines
+    # Use a backref to ensure the same number of pluses is matched on close.
+    METADATA_RE = re.compile(rb"(?m)^(\+{3,})\s*(.*?)\s*^\1\s*$", re.S)
+    match = METADATA_RE.search(content)

Note: If OKP files sometimes put front-matter away from the top, keep your current regex, but still consider pre-compiling at module scope for efficiency.


3-3: Nit: docstring grammar

“Utility methods for processing OKP files.” reads more naturally.

-"""Utility methods processing OKP files."""
+"""Utility methods for processing OKP files."""

49-61: Optional: cache metadata parsing to avoid repeated IO

If the processor tends to request both URL and title for the same file(s), caching parse_metadata will reduce repeated file reads/parsing.

Add near imports:

from functools import lru_cache

Decorate and normalize the key:

-def parse_metadata(filepath: Path) -> dict[str, Any]:
+@lru_cache(maxsize=1024)
+def parse_metadata(filepath: Path) -> dict[str, Any]:
     """Extract metadata from the OKP file."""
-    with open(filepath, "rb") as f:
+    # Normalize key for cache consistency
+    filepath = Path(filepath)
+    with open(filepath, "rb") as f:
         content = f.read()

Note: If you adopt the earlier change to pass Path(file_path) at call sites, this cache will be effective. If you keep accepting both str and Path, normalize to Path inside the function as shown.

Also applies to: 66-74

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between d9b1560 and 9fd4ea2.

📒 Files selected for processing (3)
  • .gitignore (1 hunks)
  • src/lightspeed_rag_content/okp.py (1 hunks)
  • tests/test_okp.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • .gitignore
  • tests/test_okp.py
🧰 Additional context used
🪛 GitHub Actions: Pyright
src/lightspeed_rag_content/okp.py

[error] 68-68: Step 'uv run pyright src' failed with Pyright error: Argument of type 'str' cannot be assigned to parameter 'filepath' of type 'Path' in function 'parse_metadata'. 'str' is not assignable to 'Path' (reportArgumentType).


[error] 73-73: Step 'uv run pyright src' failed with Pyright error: Argument of type 'str' cannot be assigned to parameter 'filepath' of type 'Path' in function 'parse_metadata'. 'str' is not assignable to 'Path' (reportArgumentType).

🪛 GitHub Actions: Ruff
src/lightspeed_rag_content/okp.py

[error] 5-10: Ruff I001: Import block is unsorted or unformatted. Organize imports. Run 'ruff --fix' to auto-fix.


@Akrog (Contributor) left a comment


I think some documentation is necessary.
It may be enough to have an example with a brief explanation (the example could be a section of your PR for openstack lightspeed).

@umago (Contributor, Author) commented Aug 15, 2025

@Akrog thanks for the review! Regarding documentation, I will do it in the docstrings; leaving it only in the PR or in the commit message may not be easily accessible for people using this module.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (2)
src/lightspeed_rag_content/okp.py (2)

97-109: Broaden exception handling so one bad file doesn’t halt iteration

Only ValueError is caught; TOML parse errors or bad encodings will currently bubble and stop the scan. Catch tomllib.TOMLDecodeError and UnicodeDecodeError as well, and keep logging.

Apply this diff:

-        except ValueError as e:
-            LOG.warning("Skipping OKP file %s: %s", filepath, e)
+        except (ValueError, tomllib.TOMLDecodeError, UnicodeDecodeError) as e:
+            LOG.warning("Skipping OKP file %s: %s", filepath, e)

64-67: Bug: returns a string instead of bool; also lacks type checks for URL/title

The current expression returns the stripped title string when truthy (not a boolean), and will raise if title is non-string. Tighten validation and ensure a boolean is returned.

Apply this diff:

-    return (
-        "reference_url" in metadata.get("extra", {})
-        and metadata.get("title", "").strip()
-    )
+    url = metadata.get("extra", {}).get("reference_url")
+    title = metadata.get("title")
+    return (
+        isinstance(url, str) and url.strip() != "" and
+        isinstance(title, str) and title.strip() != ""
+    )
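The tightened check above can be exercised directly; unlike the original expression, it always yields a genuine bool. The sample metadata is illustrative.

```python
from typing import Any


def metadata_has_url_and_title(metadata: dict[str, Any]) -> bool:
    """Require a non-empty string reference_url and a non-empty string title."""
    url = metadata.get("extra", {}).get("reference_url")
    title = metadata.get("title")
    return (
        isinstance(url, str) and url.strip() != ""
        and isinstance(title, str) and title.strip() != ""
    )


good = {"title": "Doc", "extra": {"reference_url": "https://example.com"}}
assert metadata_has_url_and_title(good) is True
assert metadata_has_url_and_title({"title": "Doc"}) is False                # no URL
assert metadata_has_url_and_title({"title": 42,
                                   "extra": {"reference_url": "u"}}) is False  # non-str title
```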
🧹 Nitpick comments (3)
src/lightspeed_rag_content/okp.py (3)

48-53: Defensively normalize inputs in is_file_related_to_projects (avoid type surprises)

portal_product_names can be a string or a non-list value; also avoid reassigning the parameter name for readability. Normalize both inputs to lists of lowercase strings before matching.

Apply this diff:

-    product_names = metadata.get("extra", {}).get("portal_product_names", [])
-    # Lowercase both lists
-    product_names = [p.lower() for p in product_names]
-    projects = [p.lower() for p in projects]
-    return any(p in pn for p in projects for pn in product_names)
+    raw_product_names = metadata.get("extra", {}).get("portal_product_names", [])
+    # Normalize product names
+    if isinstance(raw_product_names, str):
+        product_names = [raw_product_names.lower()]
+    elif isinstance(raw_product_names, (list, tuple, set)):
+        product_names = [p.lower() for p in raw_product_names if isinstance(p, str)]
+    else:
+        product_names = []
+    # Normalize project terms
+    project_terms = [p.lower() for p in projects if isinstance(p, str) and p.strip()]
+    return any(p in pn for p in project_terms for pn in product_names)

132-138: Anchor front-matter regex to file start to avoid false positives

As written, it can match +++ blocks found later in the body. Anchoring improves correctness for standard TOML front matter at the file start.

Apply this diff:

-    # Extract everything between the +++ markers
-    match = re.search(rb"\+{3,}\s*(.*?)\s*\+{3,}", content, re.S)
+    # Extract front matter between +++ markers at the start of the file
+    match = re.search(rb"\A\+{3}\s*\n(.*?)\n\+{3}", content, re.S)

144-152: Avoid double parsing and consider resilience to missing keys

Both url_function and get_file_title parse the file separately, which duplicates IO/parse work when used back-to-back (e.g., in populate). Optionally cache or parse once per file operation, or override populate to read metadata once and derive both fields. Also, these assume keys always exist; if used outside your filtering pipeline, a KeyError will surface.

  • If you want, I can propose an override of MetadataProcessor.populate in this class to parse once and extract both fields.
  • Verify call sites guarantee presence of extra.reference_url and title before invoking these accessors.
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 14b94c5 and 4cc0a11.

📒 Files selected for processing (3)
  • .gitignore (1 hunks)
  • src/lightspeed_rag_content/okp.py (1 hunks)
  • tests/test_okp.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • .gitignore
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/test_okp.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/lightspeed_rag_content/okp.py (1)
src/lightspeed_rag_content/metadata_processor.py (1)
  • MetadataProcessor (26-96)
🔇 Additional comments (1)
src/lightspeed_rag_content/okp.py (1)

1-26: Module scaffold and logging look solid

License header, imports, and module-level logger are set up correctly. Good foundation.

@umago (Contributor, Author) commented Aug 15, 2025

For those interested, this is how we are using it for OpenStack: openstack-lightspeed/rag-content#58

@lpiwowar (Contributor) left a comment


Looks good to me 👍 except for one small nit. I feel like the `type: ignore` comments should be a last resort if we commit to type checking.

This patch is adding an OKP module with functions to help processing OKP
files.

Signed-off-by: Lucas Alvares Gomes <lucasagomes@gmail.com>
@lpiwowar (Contributor) left a comment


LGTM!:) 👍

@umago (Contributor, Author) commented Aug 18, 2025

@tisnik whenever you have some free time. Just some common methods that I think other lightspeed teams might benefit from when handling OKP files (here is the OpenStack usage for reference: openstack-lightspeed/rag-content#58).

@tisnik (Collaborator) left a comment


LGTM, thank you

@tisnik tisnik merged commit 754e6f9 into lightspeed-core:main Aug 19, 2025
13 checks passed