Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix dataflow submission #1691

Merged
merged 16 commits into from
Mar 10, 2022
Merged

fix dataflow submission #1691

merged 16 commits into from
Mar 10, 2022

Conversation

nbren12
Copy link
Contributor

@nbren12 nbren12 commented Mar 8, 2022

master is failing because the dependencies in workflows/dataflow/setup.py were incompatible with
constraints.txt. Dataflow now supports running via a docker image, which will be
easier to ensure a consistent environment.

  • switch to using docker for dataflow jobs
  • Switch to latest dataflow version. Our current one has been deprecated.
  • ensure dependencies are compatible with the versions installed in the dataflow
    base image

These dependencies mostly match our stack, but did force some updates to
libraries in the google stack like tensorflow and the google-cloud-* packages.

Resolves #1678

The dependencies in `workflows/dataflow/setup.py` were incompatible with
constraints.txt. Dataflow now supports running via a docker image, which will be
easier to ensure a consistent environment.

* switch to using docker for dataflow jobs
* ensure dependencies are compatible with the versions installed in the dataflow
  base image

These dependencies mostly match our stack, but did force some updates to
libraries in the google stack like tensorflow and the google-cloud-* packages.

Resolves #1678
* regression data needed updating (probably bc of new numpy version)
* fix newly enforced behavior that np.pad cannot pad int data with
np.nan
@nbren12
Copy link
Contributor Author

nbren12 commented Mar 8, 2022

Looks like we need to

  • update prognostic run regression data

  • update make test_regression data. @brianhenn I actually cannot run these tests locally. I get this complaint:

    E _pytest.config.ConftestImportFailure: (local('/home/noahb/workspace/ai2cm/fv3net/_build/pip_packages/lib/python3.8/site-packages/matplotlib/tests/conftest.py'), (<class 'OSError'>, OSError('The baseline image directory does not exist. This is most likely because the test data is not installed. You may need to install matplotlib from source to get the test data.'), <traceback object at 0x7fbddd98eb00>))

    Do you have a workaround?

@nbren12
Copy link
Contributor Author

nbren12 commented Mar 8, 2022

@spencerkclark The regression tests are failing because some of the coarsening routines want to pad an array of ints with
(np.nan), an operation that I don't think newer versions of numpy allows any longer. The int arrays are usually the coordinates.

I've been tracking down a lot of errors with this PR, so I would definitely would appreciate your help fixing that. Feel free to push directly here if you have any quick fixes.

spencerkclark and others added 2 commits March 9, 2022 18:51
This ensures that the dummy coordinates we construct during the
vertical remapping process have dtype np.float32 from the start.
Our coarsening procedure of these coordinates requires they be
of a floating point type.
for some reason these tests fail when run via the `coverage` utility. We don't
look at coverage stats much, so I took the expedient option and removed it for
this small number of tests.
@nbren12
Copy link
Contributor Author

nbren12 commented Mar 9, 2022

The test_unit failure looks like an unrelated network access issue.

Copy link
Contributor Author

@nbren12 nbren12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All new deps are ok.

@@ -18,93 +18,110 @@ astor==0.8.1
astunparse==1.6.3
async-generator==1.10
async-timeout==3.0.1
attrs==19.3.0
avro-python3==1.9.2.1
atomicwrites==1.4.0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MIT

babel==2.8.0
backcall==0.2.0
backoff==1.10.0
base58==2.1.0
beautifulsoup4==4.10.0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mit

bleach==3.3.0
blinker==1.4
bokeh==2.2.3
bs4==0.0.1
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a real package. is related to beautifulsoup4 above. https://pypi.org/project/bs4/

cffi==1.14.4
cfgv==3.2.0
cftime==1.2.1
chardet==3.0.4
charset-normalizer==2.0.11
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MIT

colorcet==2.0.6
commonmark==0.9.1
conda-lock==0.4.1
configparser==5.0.2
crcmod==1.7
cryptography==36.0.1
cycler==0.10.0
cython==0.29.27
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apache

testpath==0.4.4
tf-estimator-nightly==2.8.0.dev2021122109
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apache 2

testpath==0.4.4
tf-estimator-nightly==2.8.0.dev2021122109
threadpoolctl==3.1.0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bsd 3

typing-inspect==0.6.0
typing-extensions==4.0.1
typing-inspect==0.7.1
typing-utils==0.1.0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apache 2

tzlocal==2.1
urllib3==1.25.10
uritemplate==4.1.1
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bsd-3 or apache 2

xarray==0.19.0
xgcm==0.6.1
xmltodict==0.12.0
xpartition==0.2.0
yarl==1.6.3
yq==2.11.0
zarr==2.7.0
zipp==3.7.0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mit

Copy link
Member

@spencerkclark spencerkclark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Nice that we can use a docker image for dataflow now.

@@ -76,13 +76,10 @@ def test_MultiDatasetMapper_keys(multi_dataset_mapper, expected_keys):


def test_MultiDatasetMapper_value(multi_dataset_mapper, datasets):
single_time = datasets[0][TIME_NAME].isel({TIME_NAME: 0}).item()
time_key = pd.to_datetime(single_time).strftime(TIME_FMT)
single_time = pd.to_datetime(datasets[0][TIME_NAME].isel({TIME_NAME: 0}).item())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, the fact that calling item() on a np.datetime64 array returns an integer rather than a np.datetime64 scalar is weird, but a known issue (pydata/xarray#3256). This is definitely the right thing to do (I probably should have written things this way to begin with).

I think with this change we could probably still use sel down below, but the meaning of this test is still the same with isel, so there's no reason to change it. Interesting that it worked before, but now no longer works (I guess pandas used to automatically cast integer indexers to a DatetimeIndex to np.datetime64 values, and now it no longer does).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it was able to look things up with the int. Definitely outside of my expertise.

@nbren12 nbren12 merged commit 00851a3 into master Mar 10, 2022
@nbren12 nbren12 deleted the beam-container branch March 10, 2022 16:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

dataflow tests failing
2 participants