Fix + small addition to SimpleDirectoryReader #213

Merged
merged 5 commits into from Jan 12, 2023

emptycrown
Contributor

Previously, the recursive attribute was ignored: the reader always descended into subdirectories, even though the attribute defaults to False. This PR fixes that.

I also added a new attribute that limits the number of files added to the reader, for cases where the recursion goes deep. The directory walk still runs to completion (that is not the concern); the limit only keeps too many Documents from being created. I also switched the resulting new_input_files array from DFS to BFS order, so the file limit prioritizes top-level files.
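
A minimal sketch of the behavior described above, not the code merged in this PR. The helper name collect_files and its signature are hypothetical; only the recursive flag, the file limit, and the BFS ordering come from the description. It assumes the limit is applied after the walk finishes, as the comment suggests.

```python
from collections import deque
from pathlib import Path
from typing import List, Optional


def collect_files(
    input_dir: Path,
    recursive: bool = False,
    num_files_limit: Optional[int] = None,
) -> List[Path]:
    """Hypothetical helper: list files breadth-first, shallowest first."""
    files: List[Path] = []
    queue = deque([Path(input_dir)])
    while queue:
        current = queue.popleft()
        subdirs = []
        # Sort entries so the resulting order is deterministic.
        for entry in sorted(current.iterdir()):
            if entry.is_dir():
                subdirs.append(entry)
            else:
                files.append(entry)
        # Only descend when recursion was explicitly requested.
        if recursive:
            queue.extend(subdirs)
    # The walk has completed; truncate so top-level files are kept first.
    if num_files_limit is not None and num_files_limit > 0:
        files = files[:num_files_limit]
    return files
```

Because the queue is first-in, first-out, every file in a directory is recorded before any file in its subdirectories, so truncating the list drops the deepest files first.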

Collaborator

@jerryjliu jerryjliu left a comment

oh woah nice catch!

would you mind adding a test for this (e.g. testing that if recursive is False, we don't go deep + testing num_files_limit) in tests/readers/test_file.py? Would be good to catch stuff like this in the future.

@jerryjliu
Collaborator

(btw i can also do it, will have some time later today)

@emptycrown
Contributor Author

Yep I'll be working on some stuff later today and I'll also write the tests

@jerryjliu
Collaborator

@emptycrown oops i got excited and already added tests, but please take a look!

@emptycrown
Contributor Author

Yep, lgtm!

@jerryjliu jerryjliu merged commit 1cc7aea into run-llama:main Jan 12, 2023
viveksilimkhan1 pushed a commit to viveksilimkhan1/llama_index that referenced this pull request Oct 30, 2023
* Fix regex -> re import. (run-llama#205)

* Add 'gpu' marker. (run-llama#208)

* Optional authentation verification at init time (run-llama#206)

* Add verify_auth param.

* Add _verify_auth() for Cohere and Anthropic.

* Make auth check mandatory again.

* Remove redundant PR.

* Fix OpenLLaMA model names (run-llama#209)

* Update PR template. (run-llama#207)

* Update OpenLLaMA model names.

* fix model name in documentation (run-llama#210)

* Bump version to 0.4.1. (run-llama#211)

---------

Co-authored-by: Sofie Van Landeghem <svlandeg@users.noreply.github.com>