This happens with the user_score.py and hourly_team_score.py pipelines.
Jobs complete successfully, but the initial size estimation fails with the following error:
INFO:root:Could not estimate size of source <apache_beam.io.textio._TextSource object at 0x03E34FF0> due to an exception: Traceback (most recent call last):
  File "C:\Users\chamikara\pythontest1\rc4_test\env_rc4_1\lib\site-packages\apache_beam\runners\dataflow\dataflow_runner.py", line 554, in run_Read
    transform.source.estimate_size())
  File "C:\Users\chamikara\pythontest1\rc4_test\env_rc4_1\lib\site-packages\apache_beam\internal\gcp\json_value.py", line 59, in get_typed_value_descriptor
    raise TypeError('Cannot get a type descriptor for %s.' % repr(obj))
TypeError: Cannot get a type descriptor for 23899840340L.
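This seems to be due to a combination of two things: on Windows, Python 2 returns a long rather than an int for values that do not fit in 32 bits, such as the estimated size 23899840340 above [1], and the SDK does not handle the long type at [2].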
[1] http://stackoverflow.com/questions/22513445/python-handles-long-ints-differently-on-windows-and-unix
[2] https://github.com/apache/beam/blob/master/sdks/python/apache_beam/internal/gcp/json_value.py#L35
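The suspected gap can be illustrated with a minimal sketch. This is not the SDK's code: describe_value, INTEGER_TYPES, and the type labels are hypothetical names used only for illustration. It assumes the fix is to accept long wherever int is accepted when running on Python 2.

```python
import sys

# On Python 2, `long` is a separate type from `int`; on Python 3 only `int`
# exists. INTEGER_TYPES is a hypothetical helper, not an SDK name.
if sys.version_info[0] >= 3:
    INTEGER_TYPES = (int,)
else:
    INTEGER_TYPES = (int, long)  # noqa: F821 (only evaluated on Python 2)


def describe_value(obj):
    """Hypothetical stand-in for a type-descriptor lookup (not the SDK's code)."""
    # bool is a subclass of int, so it has to be checked before the integers.
    if isinstance(obj, bool):
        return {'type': 'Boolean', 'value': obj}
    # Checking against INTEGER_TYPES covers both int and long on Python 2,
    # so a large size estimate no longer falls through to the TypeError.
    if isinstance(obj, INTEGER_TYPES):
        return {'type': 'Integer', 'value': obj}
    if isinstance(obj, float):
        return {'type': 'Float', 'value': obj}
    if isinstance(obj, str):
        return {'type': 'Text', 'value': obj}
    raise TypeError('Cannot get a type descriptor for %s.' % repr(obj))


if __name__ == '__main__':
    # The size from the traceback; a long on 32-bit-int Python 2 builds.
    print(describe_value(23899840340))
```

A check written as `isinstance(obj, int)` alone would reject 23899840340L on Windows under Python 2, which matches the error in the traceback.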
cc: [~altay]
Imported from Jira BEAM-2294. Original Jira may contain additional context.
Reported by: chamikara.