
Ch.4 - Time Correction - input/output error when running df02.py #162

Open
jgammerman opened this issue Dec 22, 2022 · 3 comments

@jgammerman

jgammerman commented Dec 22, 2022

Hi! I'm getting the following error when running df02.py (df01.py worked fine) - any advice please?

(beam_env) jgammerman@cloudshell:~/data-science-on-gcp/04_streaming/transform (peppy-booth-371612)$ python3 ./df02.py
Traceback (most recent call last):
  File "apache_beam/runners/common.py", line 1417, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 624, in apache_beam.runners.common.SimpleInvoker.invoke_process
  File "/home/jgammerman/beam_env/lib/python3.9/site-packages/apache_beam/transforms/core.py", line 1879, in <lambda>
  File "/home/jgammerman/data-science-on-gcp/04_streaming/transform/./df02.py", line 39, in <lambda>
  File "/home/jgammerman/data-science-on-gcp/04_streaming/transform/./df02.py", line 24, in addtimezone
  File "/home/jgammerman/beam_env/lib/python3.9/site-packages/timezonefinder/timezonefinder.py", line 260, in __init__
  File "/home/jgammerman/beam_env/lib/python3.9/site-packages/timezonefinder/timezonefinder.py", line 92, in __init__
OSError: [Errno 5] Input/output error: '/home/jgammerman/beam_env/lib/python3.9/site-packages/timezonefinder/poly_zone_ids.bin'

Followed by some more output (omitting for brevity) which ends as follows:

RuntimeError: OSError: [Errno 5] Input/output error: '/home/jgammerman/beam_env/lib/python3.9/site-packages/timezonefinder/poly_zone_ids.bin' [while running 'Map(<lambda at df02.py:39>)']
Exception ignored in: <function AbstractTimezoneFinder.__del__ at 0x7f1142a9a1f0>
Traceback (most recent call last):
  File "/home/jgammerman/beam_env/lib/python3.9/site-packages/timezonefinder/timezonefinder.py", line 97, in __del__
AttributeError: poly_zone_ids

@jgammerman

jgammerman commented Dec 23, 2022

Update: I've run df02.py twice more, and now I'm getting an out-of-memory error after nearly an hour of running:

Bus error (core dumped)

Same story with df03.py, but df04.py seemed to work okay.

Is anyone else having this problem? And is it supposed to take 45-60 mins to run each file?
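
For anyone hitting the same thing: the traceback shows `addtimezone` reaching `TimezoneFinder.__init__` from inside a `Map` lambda, which suggests a new finder (and a fresh read of its `.bin` data files) is being built for every element. Below is a minimal sketch of reusing a single instance per process instead. It is not a confirmed fix for the I/O error or the bus error, only a way to cut the repeated file loads. The names `get_finder` and `finder_factory`, and the `None` fallback for unparseable coordinates, are assumptions for illustration and are not taken from df02.py.

```python
import functools


@functools.lru_cache(maxsize=1)
def get_finder():
    """Build the TimezoneFinder once per process and reuse it afterwards."""
    # Imported lazily so the module loads even where timezonefinder is absent.
    import timezonefinder
    return timezonefinder.TimezoneFinder()


def addtimezone(lat, lon, finder_factory=get_finder):
    """Return (lat, lon, timezone name) for one row of coordinates.

    finder_factory is a hypothetical hook so the lookup can be swapped out;
    by default it returns the cached TimezoneFinder.
    """
    try:
        lat = float(lat)
        lon = float(lon)
    except ValueError:
        # Assumption: non-numeric input (e.g. a CSV header row) gets no timezone.
        return lat, lon, None
    finder = finder_factory()
    return lat, lon, finder.timezone_at(lng=lon, lat=lat)
```

With the default factory, the first call pays the cost of loading the data files and later calls in the same process reuse the cached object. If the pipeline runs on several worker processes, each one would still build its own instance.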

@luisandrecunha

@jgammerman were you able to fix the long run times for the Beam pipelines? I'm seeing the same thing with df04.py. I even reduced the number of flights being transformed to 100, and still no luck!

@jgammerman

jgammerman commented Nov 14, 2023 via email
