Connector for Google Cloud Storage #301

Closed
cragwolfe opened this issue Feb 27, 2023 · 4 comments · Fixed by #746
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@cragwolfe
Contributor

Create a data connector that pulls documents from Google Cloud Storage, stores them locally (at least temporarily for processing), and runs them through unstructured.partition.auto.

See Adding Data Connectors for details on how to get started. Make sure to include a link to this issue when submitting a PR.

Definition of Done

  • The checklist has been completed.
  • The connector is able to process a single document.
  • The connector is able to process all documents in a Google Cloud Storage folder, recursively.
  • For now, it is OK to process only doc types that unstructured.partition.auto is capable of processing. Google Cloud Storage documents should be converted to PDF or Word Doc for processing (unless there is a better way).
  • Bonus points: the ability to filter by document type.
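The requirements above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the connector interface the project expects: `select_blob_names`, `local_path_for`, and `process_bucket` are hypothetical names, and the `google-cloud-storage` and `unstructured` packages are assumed to be installed with default credentials available. GCS has no real folders, so listing by prefix without a delimiter already walks a "folder" recursively, and passing a full object name as the prefix covers the single-document case.

```python
import os
from pathlib import Path
from typing import Dict, Iterable, List, Optional, Set


def select_blob_names(
    names: Iterable[str],
    prefix: str = "",
    extensions: Optional[Set[str]] = None,
) -> List[str]:
    """Keep object names under `prefix`, optionally filtered by file extension.

    Hypothetical helper: `extensions` holds lowercase suffixes such as {".pdf"}.
    """
    selected = []
    for name in names:
        # Names ending in "/" are zero-byte folder placeholders, not documents.
        if not name.startswith(prefix) or name.endswith("/"):
            continue
        ext = os.path.splitext(name)[1].lower()
        if extensions is not None and ext not in extensions:
            continue
        selected.append(name)
    return selected


def local_path_for(blob_name: str, download_dir: str) -> Path:
    """Mirror the object name as a path below `download_dir` (hypothetical layout)."""
    return Path(download_dir).joinpath(*blob_name.split("/"))


def process_bucket(
    bucket_name: str,
    prefix: str = "",
    extensions: Optional[Set[str]] = None,
    download_dir: str = "gcs-downloads",
) -> Dict[str, list]:
    """Download matching objects and run each through partition().

    Assumes google-cloud-storage and unstructured are installed; imported
    lazily so the pure helpers above work without them.
    """
    from google.cloud import storage
    from unstructured.partition.auto import partition

    client = storage.Client()
    names = [blob.name for blob in client.list_blobs(bucket_name, prefix=prefix)]
    bucket = client.bucket(bucket_name)

    results = {}
    for name in select_blob_names(names, prefix, extensions):
        target = local_path_for(name, download_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        bucket.blob(name).download_to_filename(str(target))
        results[name] = partition(filename=str(target))
    return results
```

For a public bucket, `storage.Client.create_anonymous_client()` could replace `storage.Client()`. A call such as `process_bucket("my-bucket", prefix="docs/", extensions={".pdf"})` (bucket name made up here) would cover the "filter by document type" bonus item.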
@alvarobartt
Contributor

Thanks @cragwolfe, I'll tackle this!

@cragwolfe
Contributor Author

Sounds good! Thanks @alvarobartt !

@alvarobartt
Contributor

Hi @benjats07, could you provide me with a public GCS bucket to test this integration too?

@alvarobartt
Contributor

Hi @cragwolfe, I think anyone could do this integration by taking both AzureBlobStorageConnector and S3Connector as a reference, so feel free to label it as good first issue 🤗 If no one tackles it during the upcoming week, I'll do so as soon as we have a GCS bucket to test it with!
