".google.protobuf.Timestamp" is not defined #257

Closed
alexsanderp opened this issue Jul 29, 2021 · 14 comments
Labels
api: bigquerystorage Issues related to the googleapis/python-bigquery-storage API. type: question Request for information or clarification. Not an issue.

Comments

@alexsanderp

alexsanderp commented Jul 29, 2021

I created a proto message using proto-plus and it has a timestamp type field. The field uses timestamp_pb2.Timestamp.
When calling the append_rows function from bigquery_write_client, the error occurs:

Error:

2021-07-29 19:10:20,519[ERROR]: [MSG]400 Invalid proto schema: BqMessage.proto: protoSchema.timestamp: ".google.protobuf.Timestamp" is not defined.
BqMessage.proto: protoSchema.processed_at: ".google.protobuf.Timestamp" is not defined. Entity: {hidden}
2021-07-29 19:10:20,522[ERROR]: [MSG]Traceback (most recent call last):
  File "/home/alex/.local/share/virtualenvs/{hidden}/lib/python3.8/site-packages/google/api_core/grpc_helpers.py", line 160, in error_remapped_callable
    return _StreamingResponseIterator(
  File "/home/alex/.local/share/virtualenvs/{hidden}/lib/python3.8/site-packages/google/api_core/grpc_helpers.py", line 83, in __init__
    self._stored_first_result = six.next(self._wrapped)
  File "/home/alex/.local/share/virtualenvs/{hidden}/lib/python3.8/site-packages/grpc/_channel.py", line 426, in __next__
    return self._next()
  File "/home/alex/.local/share/virtualenvs/{hidden}/lib/python3.8/site-packages/grpc/_channel.py", line 826, in _next
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.INVALID_ARGUMENT
	details = "Invalid proto schema: BqMessage.proto: protoSchema.timestamp: ".google.protobuf.Timestamp" is not defined.
BqMessage.proto: protoSchema.processed_at: ".google.protobuf.Timestamp" is not defined. Entity: {hidden}"
	debug_error_string = "{"created":"@1627596620.519395176","description":"Error received from peer ipv6:[{hidden}]:443","file":"src/core/lib/surface/call.cc","file_line":1069,"grpc_message":"Invalid proto schema: BqMessage.proto: protoSchema.timestamp: ".google.protobuf.Timestamp" is not defined.\nBqMessage.proto: protoSchema.processed_at: ".google.protobuf.Timestamp" is not defined. Entity: {hidden}","grpc_status":3}"

Example:

import proto
from google.protobuf import timestamp_pb2

class protoSchema(proto.Message):
	processed_at = proto.Field(timestamp_pb2.Timestamp, number=1)

I am using timestamp_pb2 as some timestamps may come in INT64 or STRING format.

@tswast Can you help me?

@product-auto-label product-auto-label bot added the api: bigquerystorage Issues related to the googleapis/python-bigquery-storage API. label Jul 29, 2021
@yoshi-automation yoshi-automation added triage me I really want to be triaged. 🚨 This issue needs some love. labels Aug 1, 2021
@tswast tswast added priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. and removed 🚨 This issue needs some love. triage me I really want to be triaged. labels Aug 4, 2021
@tswast
Contributor

tswast commented Aug 4, 2021

@software-dov Does this look familiar to you? Is there a known issue with proto-plus and timestamp fields?

@alexsanderp
Author

@tswast Proto plus works normally to create the proto message. The error occurs during append_rows of the bigquery write client.

@tswast
Contributor

tswast commented Aug 5, 2021

Are you using the proto3 wire format? I heard from some backend engineers that proto2 is better supported.

@alexsanderp
Author

@tswast proto-plus only handles proto3. I don't see another way to create a proto message at runtime, which I need because I take a BigQuery schema and convert it to a proto message.

@tswast
Contributor

tswast commented Aug 11, 2021

Per https://cloud.google.com/bigquery/docs/write-api#data_type_conversions

TIMESTAMP columns should be represented as int64 in protobuf, where "The value is given in microseconds since the Unix epoch (1970-01-01)."

Unfortunately the backend does not yet support Well-Known-Types such as Timestamp, even with proto2. :-(
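Since the original question mentions timestamps arriving as INT64 or STRING, one way to feed an int64 field is to normalize everything to epoch microseconds before building the row. This is a minimal sketch using only the standard library. The helper name `to_epoch_micros` is hypothetical, and it assumes that integers are already epoch microseconds, that strings are ISO 8601 in a form `datetime.fromisoformat` accepts, and that naive datetimes are UTC.

```python
import datetime

_EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)
_ONE_MICROSECOND = datetime.timedelta(microseconds=1)


def to_epoch_micros(value):
    """Normalize an int, ISO 8601 string, or datetime to epoch microseconds.

    Assumptions (not from the thread): ints are already microseconds since
    the Unix epoch, and naive datetimes are interpreted as UTC.
    """
    if isinstance(value, bool):
        # bool is a subclass of int; reject it rather than return 0 or 1.
        raise TypeError("bool is not a valid timestamp")
    if isinstance(value, int):
        return value
    if isinstance(value, str):
        value = datetime.datetime.fromisoformat(value)
    if not isinstance(value, datetime.datetime):
        raise TypeError(f"unsupported timestamp type: {type(value).__name__}")
    if value.tzinfo is None:
        value = value.replace(tzinfo=datetime.timezone.utc)
    # Dividing one timedelta by another yields an exact integer count.
    return (value - _EPOCH) // _ONE_MICROSECOND
```

The result can then be assigned to an int64 field on the row message, as the BigQuery documentation quoted above describes.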

@tswast tswast added type: question Request for information or clarification. Not an issue. and removed type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. priority: p2 Moderately-important priority. Fix may not be included in next release. labels Aug 11, 2021
@tswast
Contributor

tswast commented Aug 12, 2021

Here's an example of writing a timestamp from a code sample I'm working on:

    row = sample_data_pb2.SampleData()
    timestamp_value = datetime.datetime(
        2021, 8, 12, 16, 11, 22, 987654, tzinfo=datetime.timezone.utc
    )
    epoch_value = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)
    delta = timestamp_value - epoch_value
    row.timestamp_col = int(delta.total_seconds()) * 1000000 + int(delta.microseconds)
    proto_rows.serialized_rows.append(row.SerializeToString())
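The arithmetic in the snippet above works for timestamps after the epoch. For a value before 1970, `int(delta.total_seconds())` truncates toward zero while `delta.microseconds` is always non-negative, so the two parts no longer add up. A possible alternative, sketched here with the hypothetical helper name `datetime_to_micros`, is to floor-divide the timedelta by one microsecond. It assumes a timezone-aware datetime, since subtracting a naive one from the aware epoch raises `TypeError`.

```python
import datetime

EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def datetime_to_micros(value):
    """Return whole microseconds between the Unix epoch and an aware datetime."""
    return (value - EPOCH) // datetime.timedelta(microseconds=1)


# Same timestamp as in the snippet above.
timestamp_value = datetime.datetime(
    2021, 8, 12, 16, 11, 22, 987654, tzinfo=datetime.timezone.utc
)
micros = datetime_to_micros(timestamp_value)

# Half a second before the epoch: the result is negative, as expected.
before_epoch = EPOCH - datetime.timedelta(milliseconds=500)
micros_before_epoch = datetime_to_micros(before_epoch)
```

For timestamps after the epoch both approaches give the same integer, so this only matters if the data can contain pre-1970 values.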

@alexsanderp
Copy link
Author

Got it, I'm having difficulty using the write API.
When using the job or stream API, you just send the data and bigquery handles all the conversions.
Is writing so many manual things to send data the purpose of this API?

@alexsanderp
Author

I've been trying to use it for over a month :(

@tswast
Contributor

tswast commented Aug 13, 2021

> Is writing so many manual things to send data the purpose of this API?

The backend team built an API optimized for high throughput, and their choices in row serialization format reflect that. Unfortunately it does mean a lot more client-side work. :-(

Thank you for trying and sharing your experience. I will use this thread to try and advocate for more friendly options.

@alexsanderp
Author

Thanks @tswast. proto-plus support may suffice, because proto-plus already has a timestamp type that handles many cases.

@tswast
Contributor

tswast commented Nov 11, 2021

Closing, as this is a limitation of the BigQuery Storage Write API. There's not much we can do in the client to make protobuf Timestamp well-known-types work, unfortunately.

@tswast tswast closed this as completed Nov 11, 2021
@frederickmannings

frederickmannings commented Aug 1, 2023

@alexsanderp I've been having a similar problem. This solution may be a bit outdated, but if I'm having the same issue now I am sure there are others...

After wrestling with this for days and working through some alternative options, this is the best I have come up with:

  1. Create a BigQuery table with string type for your timestamp field, and override the type to TIMESTAMP. I'd recommend using the gen_bq_schema util to do this (you would then use the generated schema to create or update your BigQuery table). The option to override your timestamp will look something like the following when defined in your .proto file:

         string timestamp = 5 [
           (gen_bq_schema.bigquery) = {require : true type_override : 'TIMESTAMP'}
         ];

  2. When you construct your protocol buffer message to send to BigQuery, convert your datetime to a string in ISO format. In Python, just call isoformat().

Hope this helps.
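The second step of the workaround above might look like the following in Python. This is only a sketch: the helper name `to_timestamp_string` is hypothetical, naive datetimes are assumed to be UTC, and converting to UTC before formatting is an extra choice made here, not something the comment specifies.

```python
import datetime


def to_timestamp_string(value):
    """Format a datetime as an ISO 8601 string with an explicit UTC offset.

    Assumption (not from the thread): naive datetimes are treated as UTC.
    """
    if value.tzinfo is None:
        value = value.replace(tzinfo=datetime.timezone.utc)
    return value.astimezone(datetime.timezone.utc).isoformat()


iso_value = to_timestamp_string(
    datetime.datetime(2021, 8, 12, 16, 11, 22, 987654, tzinfo=datetime.timezone.utc)
)
```

The resulting string would be assigned to the string field declared with the TIMESTAMP override in the .proto file.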

@ostash

ostash commented Mar 29, 2024

I'm wondering whether this should be reopened. https://cloud.google.com/bigquery/docs/write-api#data_type_conversions says that supported protocol buffer types for TIMESTAMP are:

int64 (preferred), int32, uint32, google.protobuf.Timestamp
