-
Notifications
You must be signed in to change notification settings - Fork 157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use tensorflow-macos and clean up some test running warning noise #741
Conversation
I'm seeing this test failure in CI:
This appears to also fail on
Also concerning, running |
true_sample_list = ( | ||
sorted(true_sample_set) | ||
if min_true_samples > 0 or sample_ids is None | ||
else list(true_sample_set) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
concern about increased memory space as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JGSweets @taylorfturner I didn't do this just to fix the typing issue, I did it to fix the warning below which warns that set
s can't be used as indexers anymore (there's more discussion on why this was done here:
pandas-dev/pandas#42825):
/repos/DataProfiler/dataprofiler/profilers/profile_builder.py:610: FutureWarning: Passing a set as an indexer is deprecated and will raise in a future version. Use a list instead.
df_series = df_series.loc[true_sample_list]
I'm open to other suggestions of how to address this - I don't really know numpy/pandas best practices here but I'd guess there should be an idiomatic way to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the counter argument is that sorted
was already going to convert it to a list and the memory was already being used whether or not it was "in-place".
Hence, this should be fine.
I believe this is a deprecated usage and we are required to now specify
The reason for varying failure could be platform based / numpy version. When i updated from |
@JGSweets Ah, got it. That makes sense. Regardless, the outer array doesn't need to be an
Have you guys considered using something like poetry to do the dependency management? It would simplify a bunch of stuff and add the capability to do a |
I am interested, but haven't had time to consider it yet. |
Dataprofiler doesn't install on MacOS on an M1 right now because the package
tensorflow
can't be installed. The package fordarwin
should betensorflow-macos
- https://developer.apple.com/metal/tensorflow-plugin/ - we could also installtensorflow-metal
? Not sure if we want that to be an optional install?When running the tests there's also a lot of noise in the test output. This PR fixes some of the lowest hanging fruit.
requests
: