Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Blob labels returned #37

Merged
merged 8 commits into from
Jan 24, 2022
Merged

Blob labels returned #37

merged 8 commits into from
Jan 24, 2022

Conversation

gregordecristoforo
Copy link
Member

Implementation of blob labels as a separate variable.
Blob labels are required for using 2d_propagating_blobs to create test data for supervised machine learning.
Note that the current blob implementation returns the same label for all blobs. We can discuss together with Filippo whether it is advisable to implement different labels for different blobs in the future.

This is a first quick implementation, the following features should be added in the future:

  • analytically calculated max_amplitudes instead of estimated max amplitudes
    (current implementation with np.max())
  • animation function that shows density and blob labels at the same time

This commit provides an implementation of the blob labels as a separate variable.
Blob labels are required for using 2d_propagating_blobs to create test data for supervised machine learning.
This is a first quick implementation, the following features should be added in the future:
- tests for blob_labels
- analytically calculated max_amplitudes instead of estimated max amplitudes
 (current implementation with np.max())
- animation function that shows density and blob labels at the same time
this commit adds a test for writing out blob labels for both speed_up = True and speed_up = False in make_realizations()
@codecov
Copy link

codecov bot commented Jan 24, 2022

Codecov Report

Merging #37 (213591a) into main (1857b95) will increase coverage by 1.84%.
The diff coverage is 97.36%.

❗ Current head 213591a differs from pull request most recent head 55cf9cf. Consider uploading reports for the commit 55cf9cf to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##             main      #37      +/-   ##
==========================================
+ Coverage   91.04%   92.88%   +1.84%     
==========================================
  Files           6        6              
  Lines         201      225      +24     
  Branches       25       35      +10     
==========================================
+ Hits          183      209      +26     
+ Misses         14       13       -1     
+ Partials        4        3       -1     
Impacted Files Coverage Δ
blobmodel/model.py 96.61% <97.29%> (+2.32%) ⬆️
blobmodel/plotting.py 73.33% <100.00%> (ø)
blobmodel/blobs.py 96.96% <0.00%> (+3.03%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1857b95...55cf9cf. Read the comment docs.

__max_amplitudes[__max_amplitudes == 0] = np.inf
__tmp = np.copy(__labels_field[:, :, __start:__stop])
__tmp[__single_blob >= __max_amplitudes * label_border] = 1
__labels_field[:, :, __start:__stop] = __tmp
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why create __tmp? Seems unnecessary:

__labels_field[:, :, __start:__stop][__single_blob >= __max_amplitudes * label_border] = 1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find another way of implementing this since I am not aware of a simpler way of applying a condition only on a slice of the array.
i.e.
__label_field[:, :, __start:__stop && __single_blob >= __max_amplitudes * label_border]

@Sosnowsky do you know a simpler way in python? Otherwise I would keep the current implementation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't what I wrote work?
__labels_field[:, :, __start:__stop][__single_blob >= __max_amplitudes * label_border] = 1

Creating the tmp array is time-comsuming, so in general you would prefer to avoid doing it if it is not necessary.

__create_xr_dataset(), __sum_up_blobs() and __sum_up_blobs_speedup() are created to shorten make_realization() mehtod.
@gregordecristoforo
Copy link
Member Author

make_realization() is split into multiple private methods to increase readability

__sum_up_blobs() and  __sum_up_blobs_speedup() are replaced by _sum_up_blobs() and __compute_start_stop()
@gregordecristoforo
Copy link
Member Author

@Sosnowsky code is refactored as discussed in my office

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants