Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Bulk insert support fp16/bf16 #32157

Merged
merged 1 commit into from
Apr 22, 2024

Conversation

cydrain
Copy link
Contributor

@cydrain cydrain commented Apr 11, 2024

Issue: #22837

@sre-ci-robot sre-ci-robot added size/XL Denotes a PR that changes 500-999 lines. area/test sig/testing test/integration integration test labels Apr 11, 2024
@mergify mergify bot added dco-passed DCO check passed. kind/feature Issues related to feature request from users labels Apr 11, 2024
Copy link
Contributor

mergify bot commented Apr 11, 2024

@cydrain ut workflow job failed, comment rerun ut can trigger the job again.

Copy link
Contributor

mergify bot commented Apr 11, 2024

@cydrain E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Copy link
Contributor

mergify bot commented Apr 11, 2024

@cydrain E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@cydrain
Copy link
Contributor Author

cydrain commented Apr 11, 2024

/run-cpu-e2e

Copy link
Contributor

mergify bot commented Apr 11, 2024

@cydrain E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Copy link
Contributor

mergify bot commented Apr 11, 2024

@cydrain E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@bigsheeper
Copy link
Contributor

@xiaocai2333 please help to review the parquet part

Copy link

codecov bot commented Apr 15, 2024

Codecov Report

Attention: Patch coverage is 78.26087% with 40 lines in your changes are missing coverage. Please review.

Project coverage is 81.82%. Comparing base (66fae53) to head (f3009dd).
Report is 12 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #32157      +/-   ##
==========================================
+ Coverage   81.76%   81.82%   +0.05%     
==========================================
  Files         999      999              
  Lines      124008   124166     +158     
==========================================
+ Hits       101400   101597     +197     
+ Misses      18739    18694      -45     
- Partials     3869     3875       +6     
Files Coverage Δ
pkg/util/typeutil/schema.go 85.74% <100.00%> (+0.11%) ⬆️
internal/util/importutilv2/json/row_parser.go 45.56% <33.33%> (ø)
internal/util/importutilv2/numpy/field_reader.go 60.82% <80.00%> (+0.60%) ⬆️
tests/integration/util_query.go 82.35% <77.77%> (+4.09%) ⬆️
internal/storage/insert_data.go 88.33% <82.35%> (+1.22%) ⬆️
internal/util/importutilv2/parquet/field_reader.go 45.06% <76.19%> (+2.34%) ⬆️
internal/util/importutilv2/numpy/util.go 38.32% <12.50%> (-1.06%) ⬇️
internal/util/importutilv2/parquet/util.go 73.59% <83.18%> (+14.77%) ⬆️

... and 48 files with indirect coverage changes

Copy link
Contributor

mergify bot commented Apr 15, 2024

@cydrain ut workflow job failed, comment rerun ut can trigger the job again.

Copy link
Contributor

mergify bot commented Apr 15, 2024

@cydrain E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Copy link
Contributor

mergify bot commented Apr 15, 2024

@cydrain ut workflow job failed, comment rerun ut can trigger the job again.

Copy link
Contributor

mergify bot commented Apr 15, 2024

@cydrain E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Copy link
Contributor

mergify bot commented Apr 16, 2024

@cydrain E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@cydrain
Copy link
Contributor Author

cydrain commented Apr 17, 2024

/run-cpu-e2e

Copy link
Contributor

mergify bot commented Apr 17, 2024

@cydrain E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@cydrain
Copy link
Contributor Author

cydrain commented Apr 17, 2024

/run-cpu-e2e

@bigsheeper
Copy link
Contributor

/lgtm

Copy link
Contributor

mergify bot commented Apr 17, 2024

@cydrain E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Copy link
Contributor

@xiaocai2333 xiaocai2333 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Losing some flexibility will lead to some compatibility issues. It is best to be compatible with the previous logic, or it needs to be discussed.

internal/util/importutilv2/parquet/util.go Outdated Show resolved Hide resolved
internal/util/importutilv2/parquet/util.go Show resolved Hide resolved
return merr.WrapErrImportFailed(
fmt.Sprintf("field name '%s' mis-match with arrow field name '%s", field.Name, arrField.Name))
}
toArrDataType, err := convertToArrowDataType(field, false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can implement a function named CanConverted(src *schempb.DataType, dst arrow.DataType) bool ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for (schemapb.Binary_vector, arrow.Binary), (schemapb.Binary_vector, arrow.Uin8 list), both return true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's difficult to implement CanConverted(src *schempb.DataType, dst arrow.DataType) bool

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for (schemapb.Binary_vector, arrow.Binary), (schemapb.Binary_vector, arrow.Uin8 list), both return true.

what's the expected behavior ?

@cydrain
Copy link
Contributor Author

cydrain commented Apr 19, 2024

/rerun ut

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
@bigsheeper
Copy link
Contributor

/lgtm

@congqixia
Copy link
Contributor

/approve

@sre-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: congqixia, cydrain

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cydrain cydrain added ci-passed manual-pass manually set pass before ci-passed labeled labels Apr 22, 2024
@sre-ci-robot sre-ci-robot merged commit 5fc439c into milvus-io:master Apr 22, 2024
14 of 15 checks passed
@cydrain cydrain deleted the caiyd_enable_bf16_test branch April 22, 2024 02:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved area/test ci-passed dco-passed DCO check passed. kind/feature Issues related to feature request from users lgtm manual-pass manually set pass before ci-passed labeled sig/testing size/XL Denotes a PR that changes 500-999 lines. test/integration integration test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants