Change Default GoogleDriveLoader Behavior to not Load Trashed Files (issue #5104) #5220

Merged
merged 1 commit into from May 25, 2023

Conversation

NickL77
Contributor

@NickL77 NickL77 commented May 24, 2023

Change Default GoogleDriveLoader Behavior to not Load Trashed Files (issue #5104)

Fixes #5104

If you need the previous behavior of loading files that used to live in the folder but are now trashed, you can use the load_trashed_files parameter:

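from langchain.document_loaders import GoogleDriveLoader

# Opt back in to loading files that have been moved to the trash.
loader = GoogleDriveLoader(
    folder_id="1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5",
    recursive=False,
    load_trashed_files=True,
)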
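
For illustration, the new default can be thought of as a filter over the file metadata that a folder listing returns. This is only a sketch, not the loader's actual code: `select_files` is a hypothetical helper, and the dictionaries merely mimic the `trashed` field found on Drive API v3 file resources.

```python
from typing import Dict, List


def select_files(files: List[Dict], load_trashed_files: bool = False) -> List[Dict]:
    """Return the files a loader would process (hypothetical helper)."""
    if load_trashed_files:
        # Old behavior: keep everything, including trashed files.
        return list(files)
    # New default: skip anything marked as trashed.
    return [f for f in files if not f.get("trashed", False)]


# Made-up sample metadata for two files in the same folder.
listing = [
    {"id": "a", "name": "notes", "trashed": False},
    {"id": "b", "name": "old draft", "trashed": True},
]

print([f["name"] for f in select_files(listing)])
print([f["name"] for f in select_files(listing, load_trashed_files=True)])
```

With the default, only the non-trashed file is kept; passing `load_trashed_files=True` keeps both.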

Since not loading trashed files is the expected behavior, should we

  1. even provide the load_trashed_files parameter?
  2. add documentation? It seems most users will stick with the default behavior.

Who can review?

Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:

DataLoaders
- @eyurtsev

Twitter: @nicholasliu77

Contributor

@hwchase17 hwchase17 left a comment

lgtm, thanks!

@hwchase17 hwchase17 merged commit f0ea093 into langchain-ai:master May 25, 2023
12 checks passed
@NickL77 NickL77 deleted the gdrive-trashed-files branch May 25, 2023 07:49
@danielchalef danielchalef mentioned this pull request Jun 5, 2023
hwchase17 pushed a commit that referenced this pull request Jun 19, 2023
# Iterate through filtered file types instead of all listed files

Fixes #6257

#4926 originally added the
functionality to filter by file type, storing the filtered files in
`_files`.

#5220 removed that
functionality when adding code to filter trashed files, because it iterated over the
`files` variable instead of the `_files` variable.

This PR simply adds the functionality back by using `_files` again.
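
The regression described above can be sketched as follows. This is a simplified illustration rather than the loader's actual source: `load_buggy`, `load_fixed`, and the sample metadata are hypothetical, and only the variable names `files` and `_files` come from the description.

```python
from typing import Dict, List, Set

DOC = "application/vnd.google-apps.document"
SHEET = "application/vnd.google-apps.spreadsheet"

# Made-up folder listing: two documents (one trashed) and one spreadsheet.
files: List[Dict] = [
    {"name": "report", "mimeType": DOC, "trashed": False},
    {"name": "budget", "mimeType": SHEET, "trashed": False},
    {"name": "old report", "mimeType": DOC, "trashed": True},
]


def load_buggy(files: List[Dict], file_types: Set[str],
               load_trashed_files: bool = False) -> List[str]:
    # The type filter result is stored in _files ...
    _files = [f for f in files if f["mimeType"] in file_types]
    # ... but the loop walks the unfiltered list, so the type filter is lost.
    return [f["name"] for f in files if load_trashed_files or not f["trashed"]]


def load_fixed(files: List[Dict], file_types: Set[str],
               load_trashed_files: bool = False) -> List[str]:
    _files = [f for f in files if f["mimeType"] in file_types]
    # Iterating over _files applies both the type filter and the trash filter.
    return [f["name"] for f in _files if load_trashed_files or not f["trashed"]]


print(load_buggy(files, {DOC}))  # the spreadsheet slips through
print(load_fixed(files, {DOC}))  # only the non-trashed document remains
```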

#### Who can review?

@hwchase17 - project lead
@eyurtsev
Undertone0809 pushed a commit to Undertone0809/langchain that referenced this pull request Jun 19, 2023
…issue langchain-ai#5104) (langchain-ai#5220)

This was referenced Jun 25, 2023
kacperlukawski pushed a commit to kacperlukawski/langchain that referenced this pull request Jun 29, 2023
…chain-ai#6258)

Development

Successfully merging this pull request may close these issues.

GoogleDriveLoader seems to be pulling trashed documents from the folder
2 participants