Conversation
Walkthrough

Adds a new OKP metadata processing module to parse TOML frontmatter from .md files, filter files by project and presence of URL/title, provide an OKPMetadataProcessor, and add unit tests for these utilities.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Caller
    participant FS as Filesystem
    participant OKP as OKP Utils
    Caller->>FS: glob("*.md")
    FS-->>Caller: list of files
    loop for each file
        Caller->>OKP: parse_metadata(file)
        OKP-->>Caller: metadata or raise
        Caller->>OKP: metadata_has_url_and_title(metadata)
        OKP-->>Caller: true/false
        Caller->>OKP: is_file_related_to_projects(metadata, projects)
        OKP-->>Caller: true/false
        alt both true
            Caller-->>Caller: yield file
        else
            Caller-->>Caller: skip file (log warning if parse succeeded)
        end
    end
```

```mermaid
sequenceDiagram
    participant Client
    participant OKP as OKPMetadataProcessor
    participant Parser as parse_metadata
    Client->>OKP: url_function(file_path)
    OKP->>Parser: parse_metadata(file_path)
    Parser-->>OKP: metadata dict
    OKP-->>Client: metadata.extra.reference_url
    Client->>OKP: get_file_title(file_path)
    OKP->>Parser: parse_metadata(file_path)
    Parser-->>OKP: metadata dict
    OKP-->>Client: metadata.title
```
Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~15 minutes
Actionable comments posted: 3
🧹 Nitpick comments (7)
.gitignore (1)
53-53: Consider making the ignored path explicitly a directory

If this is meant to be a directory for test artifacts, prefer a trailing slash to make intent explicit and avoid accidental matches with a file named tests/test_results.
```diff
-tests/test_results
+tests/test_results/
```

src/lightspeed_rag_content/okp.py (5)
15-18: Minor readability: avoid shadowing parameters during normalization

Reassigning projects can be confusing. Use new names for the normalized lists.

```diff
-    # Lowercase both lists
-    product_names = [p.lower() for p in product_names]
-    projects = [p.lower() for p in projects]
-
-    # Check if any project is in the product names
-    return any(p in pn for p in projects for pn in product_names)
+    # Lowercase both lists
+    product_names_lc = [p.lower() for p in product_names]
+    projects_lc = [p.lower() for p in projects]
+
+    # Check if any project is in the product names
+    return any(p in pn for p in projects_lc for pn in product_names_lc)
```

Also applies to: 19-21
23-30: Title/URL presence check looks good

Logic correctly verifies both presence and non-empty title. Consider a future enhancement to be more defensive by using metadata.get("title", "") to avoid potential KeyError if upstream callers misuse the function, but current guard is fine.
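A sketch of that defensive variant, using `.get` defaults so missing keys yield False instead of raising; this is a standalone reconstruction for illustration, not the module's actual code.

```python
def metadata_has_url_and_title(metadata: dict) -> bool:
    """Defensive variant: absent keys become falsy defaults, never KeyError."""
    url = metadata.get("extra", {}).get("reference_url", "")
    title = metadata.get("title", "")
    return bool(url) and bool(title.strip())

print(metadata_has_url_and_title({"title": "T", "extra": {"reference_url": "u"}}))  # True
print(metadata_has_url_and_title({"title": "   "}))                                 # False
print(metadata_has_url_and_title({}))                                               # False
```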
45-57: Anchor metadata regex to the front matter block

Anchoring to the beginning of the file/front-matter guards against accidentally matching a +++ sequence in the body. Inline flags keep it compact.

```diff
-    # Extract everything between the +++ markers
-    match = re.search(rb"\+{3,}\s*(.*?)\s*\+{3,}", content, re.S)
+    # Extract everything between the +++ front-matter markers (top-of-file)
+    match = re.search(rb"(?ms)^\s*\+{3}\s*\n(.*?)\n\s*\+{3}\s*", content)
```
45-47: Optional: cache parse results to avoid double reads

OKPMetadataProcessor calls parse_metadata twice per file (URL + title). Light caching avoids duplicate IO without complicating call sites. Suitable if files are immutable during a run.

```diff
+from functools import lru_cache
@@
-def parse_metadata(filepath):
+@lru_cache(maxsize=512)
+def parse_metadata(filepath):
```

Note: If you want to support both str and Path inputs consistently in the cache key, you may normalize filepath to str at the start of parse_metadata.
62-70: Harden accessors against missing keys

If metadata lacks expected keys, raise a clear ValueError instead of a KeyError. This keeps behavior consistent with parse_metadata's error style and plays better with callers.

```diff
     def url_function(self, file_path: str) -> str:
         """Return the URL for the OKP file."""
-        md = parse_metadata(file_path)
-        return md["extra"]["reference_url"]
+        md = parse_metadata(file_path)
+        try:
+            return md["extra"]["reference_url"]
+        except KeyError as e:
+            raise ValueError(f"reference_url not found in metadata for {file_path}") from e
@@
     def get_file_title(self, file_path: str) -> str:
         """Return the title of the OKP file."""
-        md = parse_metadata(file_path)
-        return md["title"]
+        md = parse_metadata(file_path)
+        try:
+            return md["title"]
+        except KeyError as e:
+            raise ValueError(f"title not found in metadata for {file_path}") from e
```

tests/test_okp.py (1)
35-42: Fix misleading docstring in negative test

Docstring says the metadata has both URL and title, but the test asserts False cases. Update for clarity.

```diff
-    def test_metadata_has_url_and_title_false(self):
-        """Test that the metadata has both URL and title."""
+    def test_metadata_has_url_and_title_false(self):
+        """Test that metadata missing URL or title returns False."""
```

Also applies to: 44-51
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these settings in your CodeRabbit configuration.
📒 Files selected for processing (3)
- .gitignore (1 hunks)
- src/lightspeed_rag_content/okp.py (1 hunks)
- tests/test_okp.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
src/lightspeed_rag_content/okp.py (1)
src/lightspeed_rag_content/metadata_processor.py (1)
MetadataProcessor(26-96)
tests/test_okp.py (1)
src/lightspeed_rag_content/okp.py (7)
- metadata_has_url_and_title (23-29)
- is_file_related_to_projects (11-20)
- parse_metadata (45-56)
- yield_files_related_to_projects (32-42)
- OKPMetadataProcessor (59-70)
- url_function (62-65)
- get_file_title (67-70)
🪛 GitHub Actions: Ruff
src/lightspeed_rag_content/okp.py
[error] 20-20: Ruff: Unnecessary list comprehension (C419) in okp.py. Command 'uv tool run ruff check src' failed.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-and-push-dev
- GitHub Check: Pylinter
- GitHub Check: mypy
🔇 Additional comments (2)
src/lightspeed_rag_content/okp.py (1)
5-7: No tomli fallback needed — project requires Python 3.12

pyproject.toml declares requires-python = "==3.12.*" (pyproject.toml:51) and uv.lock shows the same; no tomli dependency was found in the repo scan. Keep the existing import tomllib in src/lightspeed_rag_content/okp.py (lines 5-7) as-is; a tomli fallback is unnecessary unless you lower the project's minimum Python version.
- src/lightspeed_rag_content/okp.py (lines 5–7): import tomllib
- pyproject.toml:51 — requires-python = "==3.12.*"
- uv.lock:3 — requires-python = "==3.12.*"
- No tomli package found in dependencies/search
tests/test_okp.py (1)
65-87: Good coverage of parse_metadata behavior

The test exercises front-matter parsing and nested extra fields well, including list types. Using bytes for mock_open aligns with the binary read in parse_metadata.
Also applies to: 89-117
Actionable comments posted: 2
♻️ Duplicate comments (3)
src/lightspeed_rag_content/okp.py (3)
21-22: Good: any() now uses a generator expression (Ruff C419 satisfied)

Thanks for addressing the earlier C419. No allocation of intermediate list anymore.
45-46: Replace print with logging and catch tomllib/Unicode decode errors

Printing from libraries is discouraged. Also catch tomllib.TOMLDecodeError and UnicodeDecodeError to continue iteration on corrupt/invalid files. This aligns with prior review feedback.

```diff
-    except ValueError as e:
-        print(f"Skipping file {filepath}: {e}")
+    except (ValueError, tomllib.TOMLDecodeError, UnicodeDecodeError) as e:
+        LOG.warning("Skipping file %s: %s", filepath, e)
```
5-12: Fix Ruff I001: sort imports and add a module logger

Ruff flagged the import block ordering. Also, switch to a module logger for observability in downstream comments.

Apply:

```diff
-import re
-from pathlib import Path
-from typing import Any, Generator
-import tomllib
-
-from lightspeed_rag_content.metadata_processor import MetadataProcessor
+import logging
+import re
+from pathlib import Path
+from typing import Any, Generator
+
+import tomllib
+
+from lightspeed_rag_content.metadata_processor import MetadataProcessor
+
+LOG = logging.getLogger(__name__)
```
🧹 Nitpick comments (4)
src/lightspeed_rag_content/okp.py (4)
15-20: Defensive handling for portal_product_names

Some OKP sources may provide a single string or include stray whitespace. Normalize to a list of cleaned strings to avoid surprises.

```diff
-    product_names = metadata.get("extra", {}).get("portal_product_names", [])
-
-    # Lowercase both lists
-    product_names = [p.lower() for p in product_names]
-    projects = [p.lower() for p in projects]
+    product_names = metadata.get("extra", {}).get("portal_product_names", [])
+    if isinstance(product_names, str):
+        product_names = [product_names]
+
+    # Lowercase and strip both lists, guard for non-str entries
+    product_names = [p.lower().strip() for p in product_names if isinstance(p, str)]
+    projects = [p.lower().strip() for p in projects if isinstance(p, str)]
```
55-60: Anchor metadata regex to front-matter and consider pre-compilation

Anchoring reduces the chance of accidentally matching +++ blocks elsewhere in the document (e.g., code samples). Pre-compiling avoids recompilation on each call.

```diff
-    # Extract everything between the +++ markers
-    match = re.search(rb"\+{3,}\s*(.*?)\s*\+{3,}", content, re.S)
+    # Extract TOML front-matter between +++ markers at the start of lines
+    # Use a backref to ensure the same number of pluses is matched on close.
+    METADATA_RE = re.compile(rb"(?m)^(\+{3,})\s*(.*?)\s*^\1\s*$", re.S)
+    match = METADATA_RE.search(content)
```

Note: If OKP files sometimes put front-matter away from the top, keep your current regex, but still consider pre-compiling at module scope for efficiency.
3-3: Nit: docstring grammar

"Utility methods for processing OKP files." reads more naturally.

```diff
-"""Utility methods processing OKP files."""
+"""Utility methods for processing OKP files."""
```
49-61: Optional: cache metadata parsing to avoid repeated IO

If the processor tends to request both URL and title for the same file(s), caching parse_metadata will reduce repeated file reads/parsing.

Add near imports:

```python
from functools import lru_cache
```

Decorate and normalize the key:

```diff
-def parse_metadata(filepath: Path) -> dict[str, Any]:
+@lru_cache(maxsize=1024)
+def parse_metadata(filepath: Path) -> dict[str, Any]:
     """Extract metadata from the OKP file."""
-    with open(filepath, "rb") as f:
+    # Normalize key for cache consistency
+    filepath = Path(filepath)
+    with open(filepath, "rb") as f:
         content = f.read()
```

Note: If you adopt the earlier change to pass Path(file_path) at call sites, this cache will be effective. If you keep accepting both str and Path, normalize to Path inside the function as shown.
Also applies to: 66-74
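The caching effect suggested above can be sketched with a counter standing in for the real file read and TOML parse (names and return value hypothetical):

```python
from functools import lru_cache
from pathlib import Path

calls = {"n": 0}

@lru_cache(maxsize=1024)
def parse_metadata(filepath: Path) -> dict:
    calls["n"] += 1  # stands in for the real file read + TOML parse
    return {"title": filepath.name}

doc = Path("doc.md")
first = parse_metadata(doc)   # cache miss: performs the "IO"
second = parse_metadata(doc)  # cache hit: equal Paths hash the same
print(calls["n"])             # 1
print(first is second)        # True: the cached dict object is reused
```

Note that lru_cache returns the same object on a hit, so callers must treat the result as read-only; this is also why normalizing str vs Path keys matters, since `"doc.md"` and `Path("doc.md")` would otherwise occupy separate cache slots.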
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these settings in your CodeRabbit configuration.
📒 Files selected for processing (3)
- .gitignore (1 hunks)
- src/lightspeed_rag_content/okp.py (1 hunks)
- tests/test_okp.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- .gitignore
- tests/test_okp.py
🧰 Additional context used
🪛 GitHub Actions: Pyright
src/lightspeed_rag_content/okp.py
[error] 68-68: Step 'uv run pyright src' failed with Pyright error: Argument of type 'str' cannot be assigned to parameter 'filepath' of type 'Path' in function 'parse_metadata'. 'str' is not assignable to 'Path' (reportArgumentType).
[error] 73-73: Step 'uv run pyright src' failed with Pyright error: Argument of type 'str' cannot be assigned to parameter 'filepath' of type 'Path' in function 'parse_metadata'. 'str' is not assignable to 'Path' (reportArgumentType).
🪛 GitHub Actions: Ruff
src/lightspeed_rag_content/okp.py
[error] 5-10: Ruff I001: Import block is unsorted or unformatted. Organize imports. Run 'ruff --fix' to auto-fix.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Pylinter
- GitHub Check: build-and-push-dev
- GitHub Check: mypy
40b5bb7 to 14b94c5 (Compare)
Akrog
left a comment
I think some documentation is necessary.
It may be enough having an example with a brief explanation (the example could be a section of your PR for openstack lightspeed).
@Akrog thanks for the review! Regarding documentation, I will add it in the docstrings; otherwise, leaving it in the PR or the commit message may not be easily accessible for people using this module.
Actionable comments posted: 0
♻️ Duplicate comments (2)
src/lightspeed_rag_content/okp.py (2)
97-109: Broaden exception handling so one bad file doesn't halt iteration

Only ValueError is caught; TOML parse errors or bad encodings will currently bubble and stop the scan. Catch tomllib.TOMLDecodeError and UnicodeDecodeError as well, and keep logging.
Apply this diff:
```diff
-        except ValueError as e:
-            LOG.warning("Skipping OKP file %s: %s", filepath, e)
+        except (ValueError, tomllib.TOMLDecodeError, UnicodeDecodeError) as e:
+            LOG.warning("Skipping OKP file %s: %s", filepath, e)
```
64-67: Bug: returns a string instead of bool; also lacks type checks for URL/title

The current expression returns the stripped title string when truthy (not a boolean), and will raise if title is non-string. Tighten validation and ensure a boolean is returned.
Apply this diff:
```diff
-    return (
-        "reference_url" in metadata.get("extra", {})
-        and metadata.get("title", "").strip()
-    )
+    url = metadata.get("extra", {}).get("reference_url")
+    title = metadata.get("title")
+    return (
+        isinstance(url, str) and url.strip() != "" and
+        isinstance(title, str) and title.strip() != ""
+    )
```
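The bug can be reproduced in isolation: Python's `and` returns an operand rather than a bool, so the original expression leaks the stripped title string (sample metadata invented):

```python
metadata = {"title": "  My Title  ", "extra": {"reference_url": "https://example.test"}}

# Original style: `and` returns the last evaluated operand, not a bool.
result = (
    "reference_url" in metadata.get("extra", {})
    and metadata.get("title", "").strip()
)
print(repr(result))  # 'My Title' -- a str, which type checkers flag

# Fixed style: explicit type checks, always a bool.
url = metadata.get("extra", {}).get("reference_url")
title = metadata.get("title")
fixed = (
    isinstance(url, str) and url.strip() != ""
    and isinstance(title, str) and title.strip() != ""
)
print(fixed)  # True
```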
🧹 Nitpick comments (3)
src/lightspeed_rag_content/okp.py (3)
48-53: Defensively normalize inputs in is_file_related_to_projects (avoid type surprises)

portal_product_names can be a string or a non-list value; also avoid reassigning the parameter name for readability. Normalize both inputs to lists of lowercase strings before matching.
Apply this diff:
```diff
-    product_names = metadata.get("extra", {}).get("portal_product_names", [])
-    # Lowercase both lists
-    product_names = [p.lower() for p in product_names]
-    projects = [p.lower() for p in projects]
-    return any(p in pn for p in projects for pn in product_names)
+    raw_product_names = metadata.get("extra", {}).get("portal_product_names", [])
+    # Normalize product names
+    if isinstance(raw_product_names, str):
+        product_names = [raw_product_names.lower()]
+    elif isinstance(raw_product_names, (list, tuple, set)):
+        product_names = [p.lower() for p in raw_product_names if isinstance(p, str)]
+    else:
+        product_names = []
+    # Normalize project terms
+    project_terms = [p.lower() for p in projects if isinstance(p, str) and p.strip()]
+    return any(p in pn for p in project_terms for pn in product_names)
```
132-138: Anchor front-matter regex to file start to avoid false positives

As written, it can match +++ blocks found later in the body. Anchoring improves correctness for standard TOML front matter at the file start.
Apply this diff:
```diff
-    # Extract everything between the +++ markers
-    match = re.search(rb"\+{3,}\s*(.*?)\s*\+{3,}", content, re.S)
+    # Extract front matter between +++ markers at the start of the file
+    match = re.search(rb"\A\+{3}\s*\n(.*?)\n\+{3}", content, re.S)
```
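A quick sketch contrasting the two regexes on a hypothetical file that has no front matter but does contain a stray `+++` block in its body:

```python
import re

# A file with no front matter; a +++ block appears later in the body.
content = b"Intro paragraph\n+++\nnot = 'front matter'\n+++\nMore body text\n"

unanchored = re.search(rb"\+{3,}\s*(.*?)\s*\+{3,}", content, re.S)
anchored = re.search(rb"\A\+{3}\s*\n(.*?)\n\+{3}", content, re.S)

print(unanchored is not None)  # True: falsely matches the in-body block
print(anchored is None)        # True: correctly finds nothing at file start
```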
144-152: Avoid double parsing and consider resilience to missing keys

Both url_function and get_file_title parse the file separately, which duplicates IO/parse work when used back-to-back (e.g., in populate). Optionally cache or parse once per file operation, or override populate to read metadata once and derive both fields. Also, these assume keys always exist; if used outside your filtering pipeline, a KeyError will surface.
- If you want, I can propose an override of MetadataProcessor.populate in this class to parse once and extract both fields.
- Verify call sites guarantee presence of extra.reference_url and title before invoking these accessors.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
- .gitignore (1 hunks)
- src/lightspeed_rag_content/okp.py (1 hunks)
- tests/test_okp.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- .gitignore
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/test_okp.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/lightspeed_rag_content/okp.py (1)
src/lightspeed_rag_content/metadata_processor.py (1)
MetadataProcessor(26-96)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-and-push-dev
- GitHub Check: mypy
- GitHub Check: Pylinter
🔇 Additional comments (1)
src/lightspeed_rag_content/okp.py (1)
1-26: Module scaffold and logging look solid

License header, imports, and module-level logger are set up correctly. Good foundation.
For those interested, this is how we are using it for OpenStack: openstack-lightspeed/rag-content#58
lpiwowar
left a comment
Looks good to me 👍 except for one small nit. I feel like the type: ignore comments should be last resort if we commit to the type checking.
This patch adds an OKP module with functions to help process OKP files.

Signed-off-by: Lucas Alvares Gomes <lucasagomes@gmail.com>
@tisnik whenever you have some free time. Just some common methods that I think other lightspeed teams might benefit from when handling OKP files (here is the OpenStack usage for reference: openstack-lightspeed/rag-content#58).
Description
This patch adds an OKP module with functions to help process OKP files.
Type of change
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
New Features
Tests
Chores