AP-369 Doc processing #10

Merged
jason-raitz merged 14 commits into main from jcr-doc-processing
Aug 6, 2025

Conversation

@jason-raitz
Contributor

  • adds new etl utility doc-processing
  • tests for doc_proc
  • adds .DS_Store to .gitignore
  • updates pyproject.toml
  • updates readme with local testing and linting instructions

Comment thread willa/etl/doc_proc.py Outdated
Member

@awilfox left a comment

Looks good overall. There are a few nitpicks and one small improvement.

Comment thread willa/etl/__init__.py Outdated
Comment thread tests/etl/test_doc_proc.py
Comment thread willa/etl/doc_proc.py Outdated
Comment thread willa/etl/doc_proc.py Outdated
Comment thread README.rst
Comment thread README.rst Outdated
Comment thread .gitignore Outdated
jason-raitz and others added 9 commits August 6, 2025 16:53
Co-authored-by: A. Wilcox <AWilcox@Wilcox-Tech.com>
Co-authored-by: A. Wilcox <AWilcox@Wilcox-Tech.com>
Co-authored-by: A. Wilcox <AWilcox@Wilcox-Tech.com>
Co-authored-by: A. Wilcox <AWilcox@Wilcox-Tech.com>
Co-authored-by: A. Wilcox <AWilcox@Wilcox-Tech.com>
Co-authored-by: A. Wilcox <AWilcox@Wilcox-Tech.com>
Member

@awilfox left a comment

Needs two blank lines between methods in willa/etl/doc_proc.py, then it's ready to merge.

@jason-raitz merged commit eef344b into main on Aug 6, 2025
1 check passed
@jason-raitz deleted the jcr-doc-processing branch on August 6, 2025, 21:18
awilfox added a commit that referenced this pull request Aug 6, 2025
My suggestion in the AP-369 PR wasn't exactly correct.  We need to
return the actual split documents, not a list of lists containing them.

Fixes: eef344b ("AP-369 Doc processing (#10)")
awilfox added a commit that referenced this pull request Aug 7, 2025
My suggestion in the AP-369 PR wasn't exactly correct.  We need to
return the actual split documents, not a list of lists containing them.

Fixes: eef344b ("AP-369 Doc processing (#10)")
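
The difference the follow-up commit message describes (returning the split documents themselves rather than a list of lists containing them) can be illustrated with a minimal sketch. This is not the code from willa/etl/doc_proc.py: `split_text`, `split_all_nested`, and `split_all_flat` are hypothetical names, and plain strings stand in for whatever document type the real module uses.

```python
from itertools import chain


def split_text(text: str, size: int = 10) -> list[str]:
    """Hypothetical splitter: cut one text into fixed-size chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def split_all_nested(texts: list[str]) -> list[list[str]]:
    """Return one list of chunks per input text, i.e. a list of lists."""
    return [split_text(t) for t in texts]


def split_all_flat(texts: list[str]) -> list[str]:
    """Return the chunks themselves in a single flat list."""
    return list(chain.from_iterable(split_text(t) for t in texts))


texts = ["abcdefghijkl", "mnop"]
nested = split_all_nested(texts)  # [['abcdefghij', 'kl'], ['mnop']]
flat = split_all_flat(texts)      # ['abcdefghij', 'kl', 'mnop']
```

A caller that iterates over the result expecting individual chunks would get whole sub-lists from the nested version, which is presumably the kind of mismatch the fix addresses.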
2 participants