Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test file formats #1392 #1768

Merged
merged 16 commits into from
Feb 2, 2021
Merged

Test file formats #1392 #1768

merged 16 commits into from
Feb 2, 2021

Conversation

vitaliizazmic
Copy link
Contributor

@vitaliizazmic vitaliizazmic commented Jan 21, 2021

How

  • Create file samples
  • Check uploading each of them from Local storage
  • Remove html, orc and pickle from available formats in spec.json
  • Create share test case for all formats to pull file

Pre-merge Checklist

  • Run integration tests
  • Publish Docker images

@vitaliizazmic
Copy link
Contributor Author

vitaliizazmic commented Jan 21, 2021

/test connector=source-file

🕑 source-file https://github.com/airbytehq/airbyte/actions/runs/502087916
✅ source-file https://github.com/airbytehq/airbyte/actions/runs/502087916
🕑 source-file https://github.com/airbytehq/airbyte/actions/runs/502087916
✅ source-file https://github.com/airbytehq/airbyte/actions/runs/502087916

@sherifnada
Copy link
Contributor

@vitaliizazmic I added custom integration tests to CI -- can you make sure those tests are succeeding? you can also run them locally by running python -m pytest -s integration_tests from the module directory with the venv activated, will be much faster for iteration

@sherifnada
Copy link
Contributor

sherifnada commented Jan 27, 2021

@vitaliizazmic could you pull in @eugene-kulak 's changes from #1712 and ping me when this is ready for review? All the changes he needs to make are stylistic so I think you should be able to build on top of it. One thing to keep in mind is that you should probably put your integration tests in a separate test file for easier reading.

# Conflicts:
#	airbyte-integrations/connectors/source-file/build.gradle
#	airbyte-integrations/connectors/source-file/integration_tests/integration_source_test.py
#	airbyte-integrations/connectors/source-file/setup.py
#	docs/integrations/sources/file.md
@sherifnada
Copy link
Contributor

sherifnada commented Jan 29, 2021

/test connector=source-file

🕑 source-file https://github.com/airbytehq/airbyte/actions/runs/520191436
✅ source-file https://github.com/airbytehq/airbyte/actions/runs/520191436
🕑 source-file https://github.com/airbytehq/airbyte/actions/runs/520191436
✅ source-file https://github.com/airbytehq/airbyte/actions/runs/520191436

Copy link
Contributor

@sherifnada sherifnada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small comments, will merge tomorrow

@@ -67,7 +67,7 @@ File Formats are mostly enabled \(and further tested\) thanks to other open-sour
| :--- | :--- |
| CSV | Yes |
| JSON | Experimental |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this one still listed as experimental?

@@ -0,0 +1,21 @@
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we don't support parquet/orc/pickle can we remove them and any catalogs/sample files that are not supported?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sherifnada we support parquet and pickle

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pickle works if it contains DataFrame. Because read_pickle() method returns same type as object stored in file. And discover and read methods of client expect for DataFrame.

@@ -34,6 +34,12 @@
"paramiko==2.7.2",
"s3fs==0.5.2",
"smart-open[all]==4.1.2",
"lxml==4.6.2",
"html5lib==1.1",
"beautifulsoup4==4.9.3",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are all of these libs required? html/BS/pyarrow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This libraries required for read different file formats.

@sherifnada
Copy link
Contributor

@vitaliizazmic once you've addressed the comments could you:

  1. pull master ( i merged eugene's bugfix for source-file)
  2. bump the connector version
  3. run the test command

I'll publish and merge. Thanks!

@@ -66,13 +66,13 @@ File Formats are mostly enabled \(and further tested\) thanks to other open-sour
| Format | Supported? |
| :--- | :--- |
| CSV | Yes |
| JSON | Experimental |
| HTML | Untested |
| JSON | Yes |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally put it as 'Experimental' because Source-File couldn't handle nested JSON files
(flat JSON file worked)

@sherifnada
Copy link
Contributor

sherifnada commented Feb 1, 2021

/publish connector=connectors=source-file

Error: Workflow does not have 'workflow_dispatch' trigger

@sherifnada
Copy link
Contributor

sherifnada commented Feb 1, 2021

/publish connector=connectors=source-file

❌ connectors=source-file https://github.com/airbytehq/airbyte/actions/runs/529124974

@sherifnada
Copy link
Contributor

sherifnada commented Feb 1, 2021

/publish connector=connectors/source-file

🕑 connectors/source-file https://github.com/airbytehq/airbyte/actions/runs/529143991
❌ connectors/source-file https://github.com/airbytehq/airbyte/actions/runs/529143991

@sherifnada
Copy link
Contributor

sherifnada commented Feb 1, 2021

/test connector=source-file

🕑 source-file https://github.com/airbytehq/airbyte/actions/runs/529174443
❌ source-file https://github.com/airbytehq/airbyte/actions/runs/529174443
🕑 source-file https://github.com/airbytehq/airbyte/actions/runs/529174443
❌ source-file https://github.com/airbytehq/airbyte/actions/runs/529174443
🕑 source-file https://github.com/airbytehq/airbyte/actions/runs/529174443
❌ source-file https://github.com/airbytehq/airbyte/actions/runs/529174443

@sherifnada
Copy link
Contributor

@vitaliizazmic tests are failing, can you take a look ?

@vitaliizazmic
Copy link
Contributor Author

vitaliizazmic commented Feb 2, 2021

/test connector=source-file

🕑 source-file https://github.com/airbytehq/airbyte/actions/runs/532068832
✅ source-file https://github.com/airbytehq/airbyte/actions/runs/532068832

@sherifnada
Copy link
Contributor

sherifnada commented Feb 2, 2021

/publish connector=connectors/source-file

🕑 connectors/source-file https://github.com/airbytehq/airbyte/actions/runs/532163094
✅ connectors/source-file https://github.com/airbytehq/airbyte/actions/runs/532163094

@sherifnada sherifnada merged commit caf673d into master Feb 2, 2021
@sherifnada sherifnada deleted the vitalii/1392_test_file_formats branch February 2, 2021 22:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants