Arni/extract henry coeff #100
Conversation
very nice and clean code!
I added an RMSE error between the points that were fitted to the slope for now. Error bounds for the slope would require us to assume some error for the data points, no?
The RMSE is smallest if you include only two points, so based on RMSE you would always choose two points to estimate the Henry coefficient. There is a tradeoff here, though: including more points might mean more confidence in the slope estimate, but also less confidence if you are outside of the Henry regime. So how do we systematically choose the number of points to include in a Henry coefficient calculation? The error estimate in the slope is based on Gaussian-distributed noise, but the variance of that Gaussian is inferred from the data. See here: https://www.chem.utoronto.ca/coursenotes/analsci/stats/ErrRegr.html I'm not 100% sure the estimate of error in the slope is the way to go, but certainly RMSE is not suitable on its own, since it always leads to the conclusion to use only two points.

The code looks good, except in practice you need a more automatic guess for K and M; I suspect it will rarely converge to a global minimum with your default guess for K and M; it will often get stuck in a local minimum. In pyIAST, I use the last data point times 1.1 as the saturation loading, and use the first data points to get a Henry coefficient, giving K = KH / M. But you already thought of a better way to get a good starting point! You can keep your linearized function for Langmuir fitting; use those as starting params for the nonlinear fitting routine! 👍

Maybe it would be beautiful to keep one function with …
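A minimal sketch of those two ideas, not the PR's actual code; the names `henry_with_error` and `refine_langmuir` are hypothetical, and the slope-error formula follows the linked U. Toronto notes:

```julia
using Optim

# Standard error of the fitted slope for N ≈ a + b*P, per the linked notes:
# s_b = s_y / √Σ(Pᵢ - P̄)², with s_y² = Σrᵢ² / (n - 2), i.e. the noise
# variance of the Gaussian is inferred from the residuals themselves.
function henry_with_error(P::Vector{Float64}, N::Vector{Float64})
    n = length(P)
    Pbar, Nbar = sum(P) / n, sum(N) / n
    Spp = sum((P .- Pbar) .^ 2)
    b = sum((P .- Pbar) .* (N .- Nbar)) / Spp   # slope = Henry coefficient
    a = Nbar - b * Pbar                         # intercept (≈ 0 in Henry regime)
    s_y = sqrt(sum((N .- a .- b .* P) .^ 2) / (n - 2))
    return b, s_y / sqrt(Spp)                   # Henry coeff, its std. error
end

# Nonlinear Langmuir fit seeded with starting params (M₀, K₀) taken from
# the linearized fit, instead of a fixed default guess that risks getting
# stuck in a local minimum.
function refine_langmuir(P, N, M₀, K₀)
    loss(θ) = sum((N .- θ[1] .* θ[2] .* P ./ (1 .+ θ[2] .* P)) .^ 2)
    res = optimize(loss, [M₀, K₀])              # Nelder-Mead by default
    return Optim.minimizer(res)                 # refined [M, K]
end
```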
@Surluson this might be stale; could you re-make the pull request now that we have Travis working? Also, it looks like the Langmuir guess is a poor starting point; we can do better than having a default guess of …
@Surluson checks failed.
It seems all tests failed because …
…one with Optim.jl now
@Surluson to get this PR merged, how about we simplify it and …
@test isapprox(henry, 0.9688044280548711)
- What is the rationale for why it should be 0.96? Can you make a comment in the test to explain?
- What is the point of the `prepend!()` of zeros?
- Should we use `"H"` as a name for the Henry coefficient to distinguish it from `"K"`, since the Langmuir constant is not the Henry coefficient in the Langmuir model?
- If `θ0["K0"]` is the guess returned from the guessing function, why are we multiplying it by 1.1 in the `optimize` function for the Henry coeff? Should we build the 1.1 into the guess function instead?
- `n[2]/p[2]` etc. should be `n[2] / p[2]` for style, so we can't look at a piece of code and tell who wrote it.
- Seems you can save some lines of code in `_guess`, since the Henry coefficient needs to be guessed for both models.
- The Henry constant in the Langmuir model is not equal to K, so the guess for K0 for the Langmuir model is not correct; shouldn't it be divided by M? So, as stated above, you could get the Henry coeff guess, then divide it by M for the Langmuir K. H = K * M in the Langmuir model if you write it out (see the expansion below).
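To make that last point concrete, here is the low-pressure limit written out, using this PR's M, K notation:

```math
N(P) = \frac{M K P}{1 + K P}
\qquad\Longrightarrow\qquad
H \equiv \left.\frac{\mathrm{d}N}{\mathrm{d}P}\right|_{P=0} = M K
```

so a Henry-coefficient guess H₀ translates into a Langmuir-constant guess K₀ = H₀ / M₀.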
I've made some edits to the code.
- What does `min_mse` do?
- What does `henry_tol` do?

These seem to be artifacts of automatically choosing the number of points?
Yeah sorry, those shouldn't be there. They've been removed.
…tion_isotherm, make a few comments
Please review my changes and approve, then we can merge the PR; it looks good to me!
Everything looks good to me 👍
Added two functions (along with docstrings and tests):

- `extract_henry_coefficient`: takes a DataFrame, the names of the pressure column and adsorption column, and the number of points to use to extract the Henry coefficient. I used MultivariateStats for a linear least-squares regression (`llsq`). One issue I found was that the `llsq` function doesn't like the columns in the DataFrames, because they're usually a `Union` between `Float`s and `Missing`s.
- `fit_langmuir`: here I linearize `N = (M*K*P)/(1 + K*P)` and solve it with `llsq` as well. I check whether the first pressure point is 0, because after I linearize, I divide by the pressure (which blows up if `P = 0`). A sketch of both wrinkles follows below.
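A minimal sketch of how those two wrinkles could be handled; `fit_langmuir_sketch` is a hypothetical name, not the PR's actual implementation, and the double-reciprocal linearization 1/N = 1/(M·K·P) + 1/M is an assumption (the PR doesn't say which linearized form it uses):

```julia
using DataFrames, MultivariateStats

# Hypothetical sketch: linearized Langmuir fit via llsq, guarding against
# the Union{Float64, Missing} column eltype and the P = 0 blow-up.
function fit_langmuir_sketch(df::DataFrame, p_col::Symbol, n_col::Symbol)
    # llsq rejects Union{Float64, Missing} columns, so drop rows with
    # missings and force a concrete Float64 eltype first.
    ok = .!(ismissing.(df[!, p_col]) .| ismissing.(df[!, n_col]))
    P = Float64.(df[ok, p_col])
    N = Float64.(df[ok, n_col])
    # the linearization divides by P (and N), so any (0, 0) point must go
    keep = P .> 0.0
    P, N = P[keep], N[keep]
    # fit 1/N = (1/(M*K)) * (1/P) + 1/M; llsq appends the bias term last
    coefs = llsq(reshape(1 ./ P, :, 1), 1 ./ N)  # [slope, intercept]
    M = 1 / coefs[2]           # intercept = 1/M
    K = coefs[2] / coefs[1]    # slope = 1/(M*K)  ⟹  K = intercept / slope
    return M, K
end
```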