Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added cutoff_error parameter to enable Automatic Linestrength Cutoff #451

Closed
wants to merge 5 commits into from
Closed

Added cutoff_error parameter to enable Automatic Linestrength Cutoff #451

wants to merge 5 commits into from

Conversation

sagarchotalia
Copy link
Collaborator

@sagarchotalia sagarchotalia commented Apr 14, 2022

Description

This PR is to address #268.
Added cutoff_error parameter that will allow users to input a cutoff error, which will then be used to keep the internally calculated error below the inputted cutoff error.
The way I am doing this currently is, decrementing the cutoff by 0.001 (arbitrary value assumed for now, will change it if needed) and then recalculating the bool array 'b' and then calculating the internal error again
The size of the boolean array 'b' shall decrease with each iteration until the internal error is less than the cutoff error.

Further Changes:

I am not sure that the approach I am using now fixes #268. If there is a more efficient way to decrease the internal error, kindly suggest the same in the comments.

Comment on lines 3216 to 3218
cutoff -= 0.0001
b = df.S <= cutoff
error = df.S[b].sum() / df.S.sum() * 100
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about :
1 . we sort (a copy of) the lines by intensity intensity
2. we compute the cumulated intensity on the sorted array [last value is the total sum]
3. we find the index in the sorted array so that cumulated-intensity is ERROR_fraction * the total

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this seems to be a much better option. I'll work on this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please let me know. It will be important to also add tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @sagarchotalia ,did you try the approach I suggested above ?

@codecov-commenter
Copy link

codecov-commenter commented Apr 14, 2022

Codecov Report

Merging #451 (06fbc5c) into develop (1cc969e) will decrease coverage by 0.08%.
The diff coverage is 42.85%.

@@             Coverage Diff             @@
##           develop     #451      +/-   ##
===========================================
- Coverage    77.11%   77.03%   -0.09%     
===========================================
  Files          161      161              
  Lines        18676    18689      +13     
===========================================
- Hits         14402    14397       -5     
- Misses        4274     4292      +18     

@erwanp
Copy link
Member

erwanp commented Apr 14, 2022

Also, see the error raised : https://app.travis-ci.com/github/radis/radis/jobs/567095871#L1858

@sagarchotalia
Copy link
Collaborator Author

Also, see the error raised : https://app.travis-ci.com/github/radis/radis/jobs/567095871#L1858

Yes, the error was resolved in the next commit.

Had accidentally changed the docstring in my previous commit; changed it back to raw string
Comment on lines 3216 to 3218
cutoff -= 0.0001
b = df.S <= cutoff
error = df.S[b].sum() / df.S.sum() * 100
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @sagarchotalia ,did you try the approach I suggested above ?

@erwanp erwanp marked this pull request as draft May 7, 2022 11:16
Copy link
Member

@erwanp erwanp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these operations exist as Numpy array methods. You should use them : it will make your code much more readable; and also much more efficient. List are pure Python objects; whereas Numpy arrays run underlying C code

code29563 added a commit to code29563/radis that referenced this pull request Mar 21, 2024
addressing issue radis#268 building on PR radis#451
erwanp pushed a commit to code29563/radis that referenced this pull request Apr 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Automatic linestrength cutoff
3 participants