Improve dataloader error handling #182

Merged
merged 11 commits into from Dec 12, 2023

Conversation

miararoy
Contributor

@miararoy miararoy commented Nov 19, 2023

Problem

Data loading has no meaningful error handling, despite being a sensitive part of the flow.

resolves: #181

Solution

First, define a DataLoaderException that formalizes all data loading errors, at both the file and the row level, and unifies them under one error type. Then change the exception handling to raise this error.
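As an illustration of the idea described above, here is a minimal sketch of a unified loading error. It is not the code merged in this PR: the constructor arguments (`file_name`, `row_id`, `err`), the `format_error` method, and the `load_text_file` helper are all assumptions made for the example.

```python
from typing import Optional


class DataLoaderException(Exception):
    """Hypothetical unified error for file-level and row-level load failures."""

    def __init__(self, file_name: str, row_id: Optional[int] = None, err: str = ""):
        self.file_name = file_name
        self.row_id = row_id  # None means the whole file failed, not a single row
        self.err = err
        super().__init__(self.format_error())

    def format_error(self) -> str:
        location = f"file {self.file_name!r}"
        if self.row_id is not None:
            location += f", row {self.row_id}"
        return f"Failed to load {location}: {self.err}"


def load_text_file(path: str) -> str:
    """Hypothetical helper: read a UTF-8 text file, wrapping low-level errors."""
    try:
        with open(path, encoding="utf-8") as f:
            return f.read()
    except UnicodeDecodeError as e:
        # Decoding failures are reported with the offending file name.
        raise DataLoaderException(file_name=path, err=f"not valid UTF-8 ({e.reason})") from e
    except OSError as e:
        # Missing files, permission problems, and similar I/O errors.
        raise DataLoaderException(file_name=path, err=str(e)) from e
```

Callers can then catch a single exception type and still tell the user which file, and optionally which row, caused the failure.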

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update
  • Infrastructure change (CI configs, etc)
  • Non-code change (docs, etc)
  • None of the above: (explain here)

Test Plan

  • Add simple tests for loading .txt data
  • Change the current dataloader test mechanism so that bad cases are also expressed as (name, df, exception)

Note that tests were added for txt -> df, since the df -> documents step is already handled.
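The (name, df, exception) layout for bad cases could look roughly like the sketch below. Everything in it is hypothetical: a plain list of row dicts stands in for the DataFrame so the example runs without pandas, and `validate_rows`, `BAD_CASES`, and `run_bad_cases` are illustrative names, not the project's test code.

```python
from typing import Any, Dict, List, Tuple, Type

Rows = List[Dict[str, Any]]  # stand-in for a pandas DataFrame (assumption)


class DataLoaderException(Exception):
    """Hypothetical stand-in for the unified loading error."""


def validate_rows(rows: Rows) -> None:
    """Hypothetical row-level check: each row needs a non-empty 'id' and 'text'."""
    for i, row in enumerate(rows):
        for field in ("id", "text"):
            if not row.get(field):
                raise DataLoaderException(f"row {i}: missing or empty {field!r}")


# Bad cases expressed as (name, df, exception) triples.
BAD_CASES: List[Tuple[str, Rows, Type[Exception]]] = [
    ("missing_id", [{"text": "hello"}], DataLoaderException),
    ("empty_text", [{"id": "1", "text": ""}], DataLoaderException),
]


def run_bad_cases() -> List[str]:
    """Return the names of the cases that raised their expected exception."""
    passed = []
    for name, rows, expected in BAD_CASES:
        try:
            validate_rows(rows)
        except expected:
            passed.append(name)
    return passed
```

With pytest, the same triples could be passed to `pytest.mark.parametrize` and checked with `pytest.raises(expected)`, so each bad case is reported under its own name.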

@miararoy miararoy marked this pull request as ready for review November 26, 2023 14:48
Collaborator

@igiloh-pinecone igiloh-pinecone left a comment

@miararoy please see a few suggestions

src/canopy_cli/data_loader/errors.py (outdated, resolved)
src/canopy_cli/data_loader/errors.py (outdated, resolved)
src/canopy_cli/data_loader/errors.py (outdated, resolved)
src/canopy_cli/data_loader/data_loader.py (resolved)
src/canopy_cli/data_loader/data_loader.py (resolved)
miararoy and others added 3 commits December 10, 2023 22:00
Co-authored-by: igiloh-pinecone <118673156+igiloh-pinecone@users.noreply.github.com>
@izellevy izellevy changed the title from "Improve dataloader error hadnling" to "Improve dataloader error handling" Dec 11, 2023
Collaborator

@igiloh-pinecone igiloh-pinecone left a comment

LGTM

@miararoy miararoy added this pull request to the merge queue Dec 12, 2023
Merged via the queue into pinecone-io:main with commit ad09eae Dec 12, 2023
10 checks passed
@miararoy miararoy deleted the improve-dataloader-error-hadnling branch December 12, 2023 12:21
Development

Successfully merging this pull request may close these issues.

[Bug] Uninformative error message when upserting files where UTF-8 file decoding fails
2 participants