[java] BQ: Add Avro schema to BQ TableSchema conversion #108

Closed
RustedBones wants to merge 100 commits into bq-avro-float from avro-to-bq-schema

@RustedBones
Owner

Adds a conversion from an Avro schema to a BigQuery (BQ) TableSchema.
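
For readers unfamiliar with this kind of conversion, here is a minimal, self-contained sketch of the type and mode mapping it typically involves. It is not the code from this PR: the class `AvroToBqTypeSketch` and its methods `bqType` and `bqMode` are hypothetical names, the sketch uses plain strings rather than the Avro `Schema` or BigQuery `TableSchema` classes, and the mapping is an assumption modeled on how BigQuery converts Avro primitive types when loading Avro files.

```java
import java.util.Map;

public class AvroToBqTypeSketch {
    // Assumed mapping from Avro primitive type names to BigQuery legacy type names.
    private static final Map<String, String> PRIMITIVES = Map.of(
        "string", "STRING",
        "bytes", "BYTES",
        "int", "INTEGER",
        "long", "INTEGER",
        "float", "FLOAT",
        "double", "FLOAT",
        "boolean", "BOOLEAN");

    // Returns the BigQuery type name for an Avro primitive type name.
    static String bqType(String avroType) {
        String type = PRIMITIVES.get(avroType);
        if (type == null) {
            throw new IllegalArgumentException("Unsupported Avro type: " + avroType);
        }
        return type;
    }

    // Returns the BigQuery field mode: arrays become REPEATED,
    // unions with null become NULLABLE, everything else is REQUIRED.
    static String bqMode(boolean nullableUnion, boolean array) {
        if (array) {
            return "REPEATED";
        }
        return nullableUnion ? "NULLABLE" : "REQUIRED";
    }

    public static void main(String[] args) {
        // An Avro field declared as ["null", "long"]; prints "INTEGER NULLABLE"
        System.out.println(bqType("long") + " " + bqMode(true, false));
    }
}
```

A real conversion would also need to handle records (nested fields), enums, and logical types such as timestamps and decimals, which this sketch leaves out.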

kennknowles and others added 28 commits November 26, 2024 12:30
…che#33181)

* Add beam_PostCommit_Python_ValidatesDistrolessContainer_Dataflow

* Fix yaml format
… instead of seconds * 1000. This causes truncation. Also adds a test case that fails without this change.
* use direct executor to deflake tests

* address PR comments
* cleanup test files

* handle IOException

* handle Exception
Installing Beam Python could take a while.
…ow tests (apache#33216)

* make the model names unique

* fixed lint

* pin keras==3.6.0

* pin keras==2.12.0

* pin tf_keras==2.18.0

* pin tensorflow==2.18.0

* use 1 worker for pytest

* Fixed the tensorflow version
…e#32400)

* Refactored to allow inheritance and overriding of BasicAuthSempClient

* Fix docs and use Map#computeIfAbsent with a lambda.

* Fix integration test

* Remove 'serializable'

* Revert 'Remove 'serializable''
* fixed ML tests

* try some new setups

* created py312-ml tox section
* Create trigger json file.

* Rename trigger file

* Create Java validates Distroless container workflow
* Fix python distroless workflow (apache#33228)

* Add artifact registry credential setup

* Edit trigger file to trigger workflow

* Add setup docker stage

* Add gcloud auth configure-docker stage

* Add registries to configure-docker step
* bump hadoop version

* add to readme.md
* Update beam_LoadTests_Python_Combine_Flink_Streaming.yml

Increase the number of workers.

* updated parallelism

* try high mem machines

* updated the script

* try n1

* fixed zone

* change heap size

* try prop

* more props

* try more

* try props

* more opts

* more props

* props

* n1-highmem-8

* n1-highmem-16

* n1-highmem-32

* props

* move props

* fix heap

* task mem

* more mem

* updated props

* increase --max_cache_memory_usage_mb=256

* try new props

* more mem

* updated mem

* mem

* reduce size

* options

* reduce top count

* timeout

* mem

* fanout

* small

* added small load tests now

* restore old ones

* fixed args

* minor comments
* update 2.60.0 changelog to include a fix in Bigtable

* update main changelog

---------

Co-authored-by: Danny McCormick <dannymccormick@google.com>
… executed (apache#32962)

* SolaceIO.Read: handle occasional cases when finalizeCheckpoint is not called

* Implement cache for storing the session service and implement an eviction strategy and close services more eagerly.

* Wrap message acknowledgment in a try-catch block, remove active AtomicBool from the reader.

* Store messages in a Queue that is referenced from the CheckpointMark
* Set testJavaVersion in each run

* Modify trigger file

* Add dockerTag
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.30.0 to 0.31.0.
- [Commits](golang/net@v0.30.0...v0.31.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…he#33126)

Bumps [cloud.google.com/go/spanner](https://github.com/googleapis/google-cloud-go) from 1.70.0 to 1.73.0.
- [Release notes](https://github.com/googleapis/google-cloud-go/releases)
- [Changelog](https://github.com/googleapis/google-cloud-go/blob/main/CHANGES.md)
- [Commits](googleapis/google-cloud-go@spanner/v1.70.0...spanner/v1.73.0)

---
updated-dependencies:
- dependency-name: cloud.google.com/go/spanner
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
damondouglas and others added 28 commits December 10, 2024 09:50
* Remove use of google-github-actions/auth step.

* Create beam_PostCommit_Java_IO_Performance_Tests.json
* add hadoop auth

* trigger xlang tests

* place dep in expansion service
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.31.0 to 0.32.0.
- [Commits](golang/net@v0.31.0...v0.32.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…ks (apache#33352)

Bumps [github.com/nats-io/nats-server/v2](https://github.com/nats-io/nats-server) from 2.10.22 to 2.10.23.
- [Release notes](https://github.com/nats-io/nats-server/releases)
- [Changelog](https://github.com/nats-io/nats-server/blob/main/.goreleaser.yml)
- [Commits](nats-io/nats-server@v2.10.22...v2.10.23)

---
updated-dependencies:
- dependency-name: github.com/nats-io/nats-server/v2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…#32674)

* bump confluent version

Kafka Schema Registry Client has been reported with the following vulnerabilities, due to vulnerable dependencies:
CVE-2024-26308
CVE-2024-25710

* try slightly older version due to unmet dependencies to ThrottlingQuotaExceededException

* try slightly older version due to unmet dependencies to ThrottlingQuotaExceededException

* comment on version
…e#33351)

Bumps [cloud.google.com/go/profiler](https://github.com/googleapis/google-cloud-go) from 0.4.1 to 0.4.2.
- [Release notes](https://github.com/googleapis/google-cloud-go/releases)
- [Changelog](https://github.com/googleapis/google-cloud-go/blob/main/CHANGES.md)
- [Commits](googleapis/google-cloud-go@ai/v0.4.1...apps/v0.4.2)

---
updated-dependencies:
- dependency-name: cloud.google.com/go/profiler
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…he#33327)

Bumps [cloud.google.com/go/storage](https://github.com/googleapis/google-cloud-go) from 1.47.0 to 1.48.0.
- [Release notes](https://github.com/googleapis/google-cloud-go/releases)
- [Changelog](https://github.com/googleapis/google-cloud-go/blob/main/CHANGES.md)
- [Commits](googleapis/google-cloud-go@spanner/v1.47.0...spanner/v1.48.0)

---
updated-dependencies:
- dependency-name: cloud.google.com/go/storage
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…33339)

* Fix typehint in ReshufflePerKey on global window setting.

* Only update the type hint on global window setting. Need more work in non-global windows.

* Apply yapf

* Fix some failed tests.

* Revert change to setup.py
Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.30.0 to 0.31.0.
- [Commits](golang/crypto@v0.30.0...v0.31.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…sdks/python (apache#33325)

* Update numpy requirement in /sdks/python

Updates the requirements on [numpy](https://github.com/numpy/numpy) to permit the latest version.
- [Release notes](https://github.com/numpy/numpy/releases)
- [Changelog](https://github.com/numpy/numpy/blob/main/doc/RELEASE_WALKTHROUGH.rst)
- [Commits](numpy/numpy@v1.14.3...v2.2.0)

---
updated-dependencies:
- dependency-name: numpy
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* increment in setup.py

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jack McCluskey <thejackmccluskey@gmail.com>
* Switch to use unshaded hive-exec for io expansion service

* This enables the shadow jar pick up dependencies of newer versions

* cleanup leftovers
…che#33365)

Bumps [cloud.google.com/go/bigquery](https://github.com/googleapis/google-cloud-go) from 1.64.0 to 1.65.0.
- [Release notes](https://github.com/googleapis/google-cloud-go/releases)
- [Changelog](https://github.com/googleapis/google-cloud-go/blob/main/CHANGES.md)
- [Commits](googleapis/google-cloud-go@spanner/v1.64.0...spanner/v1.65.0)

---
updated-dependencies:
- dependency-name: cloud.google.com/go/bigquery
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…3374)

Bumps [google.golang.org/api](https://github.com/googleapis/google-api-go-client) from 0.210.0 to 0.211.0.
- [Release notes](https://github.com/googleapis/google-api-go-client/releases)
- [Changelog](https://github.com/googleapis/google-api-go-client/blob/main/CHANGES.md)
- [Commits](googleapis/google-api-go-client@v0.210.0...v0.211.0)

---
updated-dependencies:
- dependency-name: google.golang.org/api
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* limit Apache snapshot repo content type

* move up Confluent repo
* feat : optimized SparkRunner batch groupByKey

* update CHANGES.md

* touch trigger files

* remove unused test
Co-authored-by: lostluck <13907733+lostluck@users.noreply.github.com>
* Remove use of static credentials

* Stage for adding back dataflow

* Remove unnecessary dataflow test
Signed-off-by: Jeffrey Kinard <jeff@thekinards.com>
…ache#33363)

* Fix typehint in ReshufflePerKey on global window setting.

* Only update the type hint on global window setting. Need more work in non-global windows.

* Apply yapf

* Fix some failed tests.

* Revert change to setup.py

* Fix custom coders not being used in reshuffle in non-global windows

* Revert changes in setup.py. Reformat.

* Make WindowedValue a generic class. Support its conversion to the correct type constraint in Beam.

* Cython does not support Python generic classes. Add a subclass as a workaround and keep it un-cythonized.

* Add comments

* Fix type error.

* Remove the base class of WindowedValue in TypedWindowedValue.

* Move TypedWindowedValue out from windowed_value.py

* Revise the comments

* Fix the module location when matching.

* Fix test failure where __name__ of a type alias not found in python 3.9

* Add a note about the window coder.

---------

Co-authored-by: Robert Bradshaw <robertwb@gmail.com>
* Create use case for enriching spanner data with bigquery

End-to-end use case that demonstrates how Spanner IO and the enrichment transform, coupled with other YAML transforms, can be used in the real world.

* Create example for bigtable enrichment

* Add project_id parameter to BigQueryWrapper

* minor changes

* remove project id being passed into bigquery wrapper

* add license

* add expected blocks

* Update examples_test.py

* Update examples_test.py

* fix formatting

* fix examples_test

Signed-off-by: Jeffrey Kinard <jeff@thekinards.com>

* Apply suggestions from code review

Co-authored-by: Jeff Kinard <jeff@thekinards.com>

* Update bigquery_tools.py

* Update bigquery_tools.py

---------

Signed-off-by: Jeffrey Kinard <jeff@thekinards.com>
Co-authored-by: Jeffrey Kinard <jeff@thekinards.com>