Handle Empty Org Files or Org Files with No Headings #86

Merged
merged 7 commits into from
Sep 10, 2022

@debanjum debanjum commented Sep 10, 2022

Main Changes

  • bf01a4f Use filename or "#+TITLE" as heading for 0th level content in org files
  • d6bd7bf Fix initializing OrgNode level to string to parse org files
  • d835467 Throw exception if no valid entries found in specified content files

Miscellaneous Improvements

  • 7df39e5 Reuse search models across pytest sessions. Merge unused pytest fixtures
  • 2dc0588 Do not normalize absolute filenames for entry links in OrgNode
  • e00bb53 Init word filter dictionary with default value as set to simplify code
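
As an illustration of the word filter change in e00bb53, here is a minimal sketch of a dictionary whose default value is a set. The function name `build_word_index` and the word-to-entry-ids layout are assumptions made for this example, not the project's actual code.

```python
from collections import defaultdict


def build_word_index(entries):
    """Map each word to the ids of the entries that contain it (hypothetical helper)."""
    # Missing keys start out as an empty set, so no "if word not in index" check is needed
    word_to_entry_ids = defaultdict(set)
    for entry_id, text in enumerate(entries):
        for word in text.lower().split():
            word_to_entry_ids[word].add(entry_id)
    return word_to_entry_ids


index = build_word_index(["Buy milk", "Milk the cow"])
```

With a plain `dict`, each insertion would first need to check for the key and create the set by hand; the default factory removes that branch.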

Resolves #83

- Previously the app failed while computing embeddings when no valid
  entries were found. This obscured the actual issue: no valid entries
  in the specified content files
- Throwing an exception early with a clear message when no entries are
  found should clarify the issue to be fixed
- See issue #83 for details
- The `level` argument passed to OrgNode during init is expected to
  be a string, not an integer
- This caused app failure only when parsing org files with no
  headings, as in issue #83, because level is set to a string of `*`s
  the moment a heading is found in the current file
- Set LINE, SOURCE link properties in property drawer correctly for
  content which falls under no heading
- See Issue #83 for more details
- Remove unused model_dir pytest fixture. It was only being used by
  the content_config fixture, not by any tests
- Reuse existing search models downloaded to khoj directory.
  Downloading search models for each pytest session seems excessive and
  slows down tests quite a bit
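
The three main changes described above can be sketched together as follows. This is a simplified illustration under stated assumptions, not the code in this PR: `Node`, `parse_org` and `extract_entries` are hypothetical names, and the real OrgNode parser handles much more (property drawers, tags, LINE and SOURCE links).

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Node:
    """Hypothetical stand-in for OrgNode, not the project's actual class."""
    level: str    # string of '*'s; "" for content above the first heading
    heading: str
    body: str


HEADING = re.compile(r"^(\*+)\s+(.*)$")
TITLE = re.compile(r"^#\+TITLE:\s*(.*)$", re.IGNORECASE)


def parse_org(text: str, filename: str) -> list[Node]:
    level = ""  # a string from the start, like the '*'s assigned once a heading is found
    heading = Path(filename).stem  # filename is the fallback heading for 0th level content
    body: list[str] = []
    nodes: list[Node] = []

    def flush() -> None:
        content = "\n".join(body).strip()
        # Keep every real heading; keep 0th level content only if it is non-empty
        if level or content:
            nodes.append(Node(level, heading, content))

    for line in text.splitlines():
        heading_match = HEADING.match(line)
        title_match = TITLE.match(line)
        if heading_match:
            flush()
            level, heading = heading_match.group(1), heading_match.group(2)
            body = []
        elif title_match and not level:
            # "#+TITLE" overrides the filename as heading for 0th level content
            heading = title_match.group(1).strip() or heading
        else:
            body.append(line)
    flush()
    return nodes


def extract_entries(files: dict[str, str]) -> list[Node]:
    entries = [node for name, text in files.items() for node in parse_org(text, name)]
    if not entries:
        # Fail early with a clear message instead of failing later in embeddings
        raise ValueError(f"No valid entries found in specified files: {', '.join(files)}")
    return entries
```

In this sketch a file with no headings yields a single node whose level is the empty string, and an empty file yields no nodes, so `extract_entries` raises before any embeddings would be computed.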
@debanjum debanjum force-pushed the handle-org-files-with-no-headings-or-empty branch from 11fa1d2 to 976397b Compare September 10, 2022 12:35
@debanjum debanjum merged commit 372dcd2 into master Sep 10, 2022
@debanjum debanjum deleted the handle-org-files-with-no-headings-or-empty branch September 10, 2022 12:56
This pull request was closed.
Linked issue: errors with org files containing certain types of structures