More Informative N Jobs Print #511

Merged
CharlesBradshaw merged 14 commits into master from update_n_jobs on Apr 26, 2019

3 participants
@CharlesBradshaw
Contributor

commented Apr 25, 2019

  • Fixes #508 (Print number of workers when running in parallel)

Also gives a warning if the user asks for more jobs than were created. For example, if the user asks for 10 jobs and only 4 workers are created (because there are either not enough cores or not enough tasks), a warning is shown.

Also gives a warning if the EntitySet is not scattered to all of the created workers.
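
For illustration, the first warning described above could be sketched roughly as follows. This is a minimal sketch, not the code merged in this PR: the function name `resolve_worker_count`, its parameters, and the exact message wording are assumptions made for the example.

```python
import warnings


def resolve_worker_count(n_jobs, cpu_cores, num_tasks):
    """Hypothetical helper: pick a worker count and warn if it is below n_jobs."""
    # Assumption: workers are capped by both the available cores and the tasks.
    n_workers = min(n_jobs, cpu_cores, num_tasks)

    if n_workers < n_jobs:
        message = "{} workers requested, but only {} will be created.".format(
            n_jobs, n_workers)
        if cpu_cores < n_jobs:
            message += " Not enough cpu cores ({}).".format(cpu_cores)
        if num_tasks < n_jobs:
            message += " Not enough tasks ({}).".format(num_tasks)
        warnings.warn(message)

    return n_workers
```

With this sketch, asking for 10 jobs on a 4-core machine returns 4 and emits a single warning that names the core count as the limiting factor.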

@codecov


commented Apr 25, 2019

Codecov Report

Merging #511 into master will increase coverage by 0.01%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master     #511      +/-   ##
==========================================
+ Coverage   96.08%   96.09%   +0.01%     
==========================================
  Files         108      109       +1     
  Lines        8867     8897      +30     
==========================================
+ Hits         8520     8550      +30     
  Misses        347      347

Impacted Files                                         Coverage Δ
featuretools/computational_backends/utils.py           95.18% <100%> (+0.27%) ⬆️
...computational_backends/calculate_feature_matrix.py  97.08% <100%> (+0.1%) ⬆️
...utational_backend/test_calculate_feature_matrix.py  99.31% <100%> (ø) ⬆️
...ts/utils_tests/test_computational_backend_utils.py  100% <100%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e9537ad...9a3b5ec.

@kmax12 kmax12 requested a review from rwedge Apr 25, 2019

@CharlesBradshaw CharlesBradshaw changed the title from "Updated print" to "More Informative N Jobs Print" Apr 25, 2019

CharlesBradshaw added some commits Apr 25, 2019

@@ -44,7 +48,17 @@ def int_es():
    return make_ecommerce_entityset(with_integer_time_index=True)


def test_scatter_warning():
    with warnings.catch_warnings(record=True) as w:

@rwedge

rwedge Apr 25, 2019

Contributor

pytest.warns could probably work here
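
As a rough comparison of the two styles, here is a sketch that checks the same warning with `warnings.catch_warnings` and with `pytest.warns`. The function `scatter_with_warning` and its message text are hypothetical stand-ins for the code under test, not the project's actual implementation.

```python
import warnings

import pytest


def scatter_with_warning():
    # Hypothetical stand-in for the code under test; the message is made up.
    warnings.warn("EntitySet was only scattered to 1 out of 2 workers",
                  UserWarning)


def test_scatter_warning_with_catch_warnings():
    # Manual style: record warnings, then inspect the recorded list.
    with warnings.catch_warnings(record=True) as w:
        warnings.simplefilter("always")
        scatter_with_warning()
    assert len(w) == 1
    assert "scattered" in str(w[0].message)


def test_scatter_warning_with_pytest_warns():
    # pytest style: the block fails if no matching UserWarning is raised.
    with pytest.warns(UserWarning, match="scattered"):
        scatter_with_warning()
```

The `pytest.warns` version states the expected category and message pattern up front, so the test needs no separate assertions on the recorded list.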



def test_create_client_and_cluster():
    with warnings.catch_warnings(record=True) as w:

@rwedge

rwedge Apr 25, 2019

Contributor

same comment as with test_scatter_warning

featuretools/tests/computational_backend/test_calculate_feature_matrix.py
    warning_string += " Not enough cpu cores({}).".format(cpu_workers)

if num_tasks < n_jobs:
    warning_string += " Not enough tasks({}).".format(num_tasks)

@rwedge

rwedge Apr 25, 2019

Contributor

Can you change this to " Not enough chunks (X), consider reducing the chunk size"?

CharlesBradshaw added some commits Apr 25, 2019

@CharlesBradshaw CharlesBradshaw requested a review from rwedge Apr 25, 2019

scatter_string = "EntitySet scattered to workers in {:.3f} seconds"
print(scatter_string.format(scatter_time))

scatter_string = "EntitySet scattered to {} workers in {:.3f} seconds"

@rwedge

rwedge Apr 25, 2019

Contributor

The issue wanted to remove the decimal part of the "in X seconds" message. Maybe round to integer and if it took less than 1 second display "<1" or "under 1"?

@CharlesBradshaw

CharlesBradshaw Apr 25, 2019

Author Contributor

I added the rounding, but I decided not to add the "<1" so that I don't have to deal with a complex code coverage situation. In theory, I could create a context manager to time it and then run an on-complete function that prints a special string, but that feels too complex for a very minor piece of functionality.
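
For reference, the context-manager idea mentioned above might look roughly like this. It is only a sketch of the approach that was considered and not adopted; the names `timed` and `format_elapsed` are hypothetical.

```python
import time
from contextlib import contextmanager


def format_elapsed(seconds):
    """Render a duration as whole seconds, or "<1" for anything under a second."""
    if seconds < 1:
        return "<1"
    return str(round(seconds))


@contextmanager
def timed(on_complete):
    """Time the enclosed block, then pass the elapsed seconds to on_complete."""
    start = time.perf_counter()
    try:
        yield
    finally:
        on_complete(time.perf_counter() - start)


def report(seconds):
    print("EntitySet scattered to workers in {} seconds".format(
        format_elapsed(seconds)))


with timed(report):
    pass  # the scatter call would go here
```

Keeping the formatting in a small pure function means the "<1" branch can be tested directly, without having to make a real scatter run for less than a second.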

@kmax12

kmax12 Apr 25, 2019

Member

I think it'd be easiest to just round up to the nearest integer second. This is just diagnostic info, so +/- 1 second doesn't matter that much.

scatter_string = "EntitySet scattered to workers in {:.3f} seconds"
print(scatter_string.format(scatter_time))

scatter_time = round(end - start)

@rwedge

rwedge Apr 25, 2019

Contributor

round() rounds to the nearest integer rather than rounding up. To avoid getting zero, we could use a function that rounds up, or take the max of the rounded number and 1.

@CharlesBradshaw

CharlesBradshaw Apr 26, 2019

Author Contributor

What is the motivation for wanting to tell the user it took 1 second if it actually took 0.001 seconds?

@rwedge

rwedge Apr 26, 2019

Contributor

I thought saying something took 0 seconds would seem strange to the user since clearly it took at least some time to do, but it's probably fine so let's just leave it as is.
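
To make the rounding options from this thread concrete, here is a small sketch comparing them. The helper `displayed_seconds` and its `strategy` names are hypothetical; only the use of `round` comes from the diff above.

```python
import math


def displayed_seconds(elapsed, strategy="nearest"):
    """Return the whole number of seconds to show for an elapsed time."""
    if strategy == "nearest":
        # What the PR does: round to the nearest integer, so 0.001 shows as 0.
        return round(elapsed)
    if strategy == "ceil":
        # Always round up, so any nonzero duration shows at least 1.
        return math.ceil(elapsed)
    if strategy == "at_least_one":
        # Round to nearest, but never show less than 1.
        return max(1, round(elapsed))
    raise ValueError("unknown strategy: {}".format(strategy))


for strategy in ("nearest", "ceil", "at_least_one"):
    print(strategy, displayed_seconds(0.001, strategy), displayed_seconds(2.4, strategy))
```

The two alternatives differ for longer durations: rounding up turns 2.4 seconds into 3, while taking the max with 1 leaves it at 2 and only changes the sub-second case.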

@rwedge

rwedge approved these changes Apr 26, 2019

@CharlesBradshaw CharlesBradshaw merged commit f868d8e into master Apr 26, 2019

4 checks passed

codecov/patch 100% of diff hit (target 96.08%)
codecov/project 96.09% (+0.01%) compared to e9537ad
license/cla Contributor License Agreement is signed.
test_all_python_versions Workflow: test_all_python_versions

@CharlesBradshaw CharlesBradshaw deleted the update_n_jobs branch Apr 26, 2019

@rwedge rwedge referenced this pull request May 17, 2019

Merged

v0.8.0 #548
