-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
convenient interface #3
base: base_PR
Are you sure you want to change the base?
Conversation
…tric into convenient_interface
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not personally familiar with pandas so I cannot comment in too much detail on how the data frames are used. However, there are a few things I noticed that could be changed or improved:
- All the function docstrings need more detail. Every parameter should have a description that includes the data type, how it is used in the function, and if it is a complicated structure it should include any information about required formatting. Additionally any value that is returned needs a similar description (data type and what it is).
- There are functions that appear to be very similar in writing, so they may be able to be combined into a more generalized function.
- For the tests, there should be a test for every function, even the small basic ones. But glad to see what looks like very thorough testing already!
- (not as important) There are several spelling errors in the docstrings, but it is not important to fix those now. Just an FYI so you can have someone check your grammar when this gets merged.
|
||
Parameters | ||
---------- | ||
evaler_ : evaler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the data type of evaler_? and what is its purpose? For all parameters, both the data type and a description of the parameter should be given.
---------- | ||
evaler_ : evaler | ||
fac_name : name of the facility | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no description of the return value provided
df = evaler_.eval('AgentEntry') | ||
df = df[df['Lifetime'] > 0] | ||
|
||
rdc_list = [] # because we want to get reed of the facility asap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reed ->> rid
Parameters | ||
---------- | ||
evaler_ : evaler | ||
fac_name : name of the facility |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fac_list is the parameter listed above, but not described here.
cymetric/convenient_interface.py
Outdated
return df | ||
|
||
|
||
def get_retirement_timeseries(evaler_, fac_list=[]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function and get_deployment_timeseries
seem similar enough that they could be generalized into a single function.
evaler_ : evaler | ||
fac_name : name of the facility | ||
nuc_list : list of nuclide to select. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comment below about what information to include for the parameters and return values
|
||
df = get_inventory_df(evaler_, fac_list, nuc_list) | ||
for i, row in df.iterrows(): | ||
val = 1000 * data.N_A * df.ix[i, 'Quantity'] * \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
multiply by 1000.0 (not an integer) - python is weird with integers and this could cause other problems.
nuc_list = format_nuclist(nuc_list) | ||
rdc_list.append(['NucId', nuc_list]) | ||
else: | ||
wng_msg = "no nuclide provided" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since nuc_list gets passed through several functions before reaching this one, perhaps the warning message should come the very first time it is checked?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, is there a default given if nuc_list is not provided?
|
||
def get_transaction_nuc_df(evaler_, send_list=[], rec_list=[], commod_list=[], nuc_list=[]): | ||
""" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove blank line here
evaler_ : evaler | ||
send_list : list of the sending facility | ||
rec_list : list of the receiving facility | ||
commod_list : list of the receiving facility |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are commod_list and rec_list different in any way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a first pass, will come look at this again next week!
evaler_ : evaler | ||
send_list : list of the sending facility | ||
rec_list : list of the receiving facility | ||
commod_list : list of the receiving facility |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
echo @kkiesling comment above on the typo
evaler_ : evaler | ||
send_list : list of the sending facility | ||
rec_list : list of the receiving facility | ||
commod_list : list of the receiving facility |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and here
evaler_ : evaler | ||
send_list : list of the sending facility | ||
rec_list : list of the receiving facility | ||
commod_list : list of the receiving facility |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and here
evaler_ : evaler | ||
send_list : list of the sending facility | ||
rec_list : list of the receiving facility | ||
commod_list : list of the receiving facility |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and here
evaler_ : evaler | ||
send_list : list of the sending facility | ||
rec_list : list of the receiving facility | ||
commod_list : list of the receiving facility |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and here
base_col = ['SimId', 'SenderId'] | ||
added_col = base_col + ['Prototype'] | ||
trans = merge_n_drop(trans, base_col, send_agent, added_col) | ||
trans = trans.rename(index=str, columns={'Prototype': 'SenderProto'}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps it would be better to spell out everything fully (SenderPrototype
and ReceiverProtoype
on line 127 below)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Full spellings are (I think) fully consistent throughout the cymetric metrics.
""" | ||
|
||
if len(nuc_list) != 0: | ||
nuc_list = format_nuclist(nuc_list) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea I think if it is called only for the base functions that these are dependent on, you should be good.
nuc_list : list of nuclide to select. | ||
""" | ||
|
||
if len(nuc_list) != 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that nuc_list is a filtering mechanism....but there is a warning in the ger_reduced_df function about having any of the filtering mechanisms empty: https://github.com/bam241/cymetric/pull/3/files#diff-908097c62f1520cddbe1acee59c44083R74
I think instead of that there should just be no filtering and all nucs/facilities/etc should be returned?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks nice! I think the number of lines here could be greatly reduced with some refactoring of certain function patterns. Maybe there is a good reason for this though, I don't know.
The testing looks very nice. I didn't see any examples in which you pass data structures that are expected to cause failures. This is something I've found is often overlooked in testing code.
|
||
def merge_n_drop(df, base_col, add_df, add_col): | ||
""" | ||
Merge some additionnal columns fram an additionnal Pandas Data Frame |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*additional
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
onother -> with another
df : Pandas Data Frame | ||
base_col : list of the base columns names | ||
add_df : Pandas Data Frame to add in the df one | ||
add_col : columns tobe added |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tobe -> to be
add_col : columns tobe added | ||
""" | ||
df = pd.merge(add_df[add_col], df, on=base_col) | ||
df.drop(base_col[1], 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
base_col
is specified as a list. Why only drop the first entry in that list?
|
||
Parameters | ||
---------- | ||
evaler_ : evaler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This parameter could use a little more description. It seems to be the data frame to be filtered.
return df | ||
|
||
|
||
def get_transaction_decayheat_df(evaler_, send_list=[], rec_list=[], commod_list=[], nuc_list=[]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These get_transaction_*_df
functions could probably be refactored or made an additional option to get_transaction_df
.
return df | ||
|
||
|
||
def get_transaction_decayheat_timeserie(evaler_, send_list=[], rec_list=[], commod_list=[], nuc_list=[]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A similar refactor to the get_transaction_*_df
functions could happen here it seems.
return df | ||
|
||
|
||
def get_inventory_timeserie(evaler_, fac_list=[], nuc_list=[]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function name looks incomplete here.
assert_frame_equal(cal, refs) | ||
|
||
|
||
@dbtest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious about the purpose of the function decorators here. Maybe you can elaborate on them during the code review meeting?
assert_frame_equal(cal, refs) | ||
|
||
|
||
if __name__ == "__main__": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this. Very convenient.
This PR aims to bring some convenient/easy methods to perform analysis on
Cyclus
output data in theCyclus
post-prossessing library,Cymetric
. The accessible informations accessible using cymetric are the grey bubble, which need to be combined to get easily understandable informations (like mass of U238 exchanged between 2 facilities as a function of the time).This allows users to get directly those informations without have to manipulate Panda Data Frames.
You can find here an small python notebook demonstrating some of this.
2 kind of pandas Data Frame can be returned:
_df
methods return a complicated panda data frame containing a lot of informations._timeseries
methods return a simple time-serie (time vs value) that is easy to understand/analyze/plot, if a single time occurs many time a sum is performed.In light green are the "standalone" methods return directly time-series formatted from the
Cyclus
output data.In blue and dark green, the method are slightly more complicated, they need to combine multiple informations from the the output file to produce the time series.
The data required to combine those are represented in light-grey bubble on the graph, and corresponds to the
evaler_.eval('XXXXXXX')
line in the different methods, which return a panda Data Frame containing the information.Those methods (blue and dark green) can be organized into 2 categories the inventories related methods and the transactions related methods. Because of the way those informations are written in the cycles output file they cannot be treated exactly in the same way.
If both corresponds to a list of materials (mass and isotopic compositions), either exchanged or stored in a facility, evolving during the simulation (time), the inventories tables contains already the isotopic composition of the different materials, and the transaction tables refers to the
materials
table which contains the isotopic compositions.In the same way,
Cymetric
(this analysis tool) already includes ways to compute Activity and DecayHeat tables for the transaction, but not for the inventories.Finally the red bubbles correspond to generic functions that are used multiple time. They are used either to shape nuclide list in the proper format, or to perform transformation on Panda Data Frame.
Note: The unit test are not finished yet, only few of them are implemented.