Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

convenient interface #3

Open
wants to merge 67 commits into
base: base_PR
Choose a base branch
from
Open

convenient interface #3

wants to merge 67 commits into from

Conversation

bam241
Copy link
Owner

@bam241 bam241 commented May 18, 2017

This PR aims to bring some convenient/easy methods to perform analysis on Cyclus output data in the Cyclus post-prossessing library, Cymetric. The accessible informations accessible using cymetric are the grey bubble, which need to be combined to get easily understandable informations (like mass of U238 exchanged between 2 facilities as a function of the time).
This allows users to get directly those informations without have to manipulate Panda Data Frames.
You can find here an small python notebook demonstrating some of this.

2 kind of pandas Data Frame can be returned:

  • the _df methods return a complicated panda data frame containing a lot of informations.
  • the _timeseries methods return a simple time-serie (time vs value) that is easy to understand/analyze/plot, if a single time occurs many time a sum is performed.

In light green are the "standalone" methods return directly time-series formatted from the Cyclus output data.

In blue and dark green, the method are slightly more complicated, they need to combine multiple informations from the the output file to produce the time series.

The data required to combine those are represented in light-grey bubble on the graph, and corresponds to the evaler_.eval('XXXXXXX') line in the different methods, which return a panda Data Frame containing the information.

Those methods (blue and dark green) can be organized into 2 categories the inventories related methods and the transactions related methods. Because of the way those informations are written in the cycles output file they cannot be treated exactly in the same way.
If both corresponds to a list of materials (mass and isotopic compositions), either exchanged or stored in a facility, evolving during the simulation (time), the inventories tables contains already the isotopic composition of the different materials, and the transaction tables refers to the materials table which contains the isotopic compositions.
In the same way, Cymetric (this analysis tool) already includes ways to compute Activity and DecayHeat tables for the transaction, but not for the inventories.

Finally the red bubbles correspond to generic functions that are used multiple time. They are used either to shape nuclide list in the proper format, or to perform transformation on Panda Data Frame.

convienient

Note: The unit test are not finished yet, only few of them are implemented.

Copy link

@kkiesling kkiesling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not personally familiar with pandas so I cannot comment in too much detail on how the data frames are used. However, there are a few things I noticed that could be changed or improved:

  • All the function docstrings need more detail. Every parameter should have a description that includes the data type, how it is used in the function, and if it is a complicated structure it should include any information about required formatting. Additionally any value that is returned needs a similar description (data type and what it is).
  • There are functions that appear to be very similar in writing, so they may be able to be combined into a more generalized function.
  • For the tests, there should be a test for every function, even the small basic ones. But glad to see what looks like very thorough testing already!
  • (not as important) There are several spelling errors in the docstrings, but it is not important to fix those now. Just an FYI so you can have someone check your grammar when this gets merged.


Parameters
----------
evaler_ : evaler

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the data type of evaler_? and what is its purpose? For all parameters, both the data type and a description of the parameter should be given.

----------
evaler_ : evaler
fac_name : name of the facility
"""

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no description of the return value provided

df = evaler_.eval('AgentEntry')
df = df[df['Lifetime'] > 0]

rdc_list = [] # because we want to get reed of the facility asap

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reed ->> rid

Parameters
----------
evaler_ : evaler
fac_name : name of the facility

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fac_list is the parameter listed above, but not described here.

return df


def get_retirement_timeseries(evaler_, fac_list=[]):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function and get_deployment_timeseries seem similar enough that they could be generalized into a single function.

evaler_ : evaler
fac_name : name of the facility
nuc_list : list of nuclide to select.
"""

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment below about what information to include for the parameters and return values


df = get_inventory_df(evaler_, fac_list, nuc_list)
for i, row in df.iterrows():
val = 1000 * data.N_A * df.ix[i, 'Quantity'] * \

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

multiply by 1000.0 (not an integer) - python is weird with integers and this could cause other problems.

nuc_list = format_nuclist(nuc_list)
rdc_list.append(['NucId', nuc_list])
else:
wng_msg = "no nuclide provided"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since nuc_list gets passed through several functions before reaching this one, perhaps the warning message should come the very first time it is checked?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is there a default given if nuc_list is not provided?


def get_transaction_nuc_df(evaler_, send_list=[], rec_list=[], commod_list=[], nuc_list=[]):
"""

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove blank line here

evaler_ : evaler
send_list : list of the sending facility
rec_list : list of the receiving facility
commod_list : list of the receiving facility

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are commod_list and rec_list different in any way?

Copy link

@opotowsky opotowsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a first pass, will come look at this again next week!

evaler_ : evaler
send_list : list of the sending facility
rec_list : list of the receiving facility
commod_list : list of the receiving facility

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

echo @kkiesling comment above on the typo

evaler_ : evaler
send_list : list of the sending facility
rec_list : list of the receiving facility
commod_list : list of the receiving facility

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here

evaler_ : evaler
send_list : list of the sending facility
rec_list : list of the receiving facility
commod_list : list of the receiving facility

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here

evaler_ : evaler
send_list : list of the sending facility
rec_list : list of the receiving facility
commod_list : list of the receiving facility

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here

evaler_ : evaler
send_list : list of the sending facility
rec_list : list of the receiving facility
commod_list : list of the receiving facility

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here

base_col = ['SimId', 'SenderId']
added_col = base_col + ['Prototype']
trans = merge_n_drop(trans, base_col, send_agent, added_col)
trans = trans.rename(index=str, columns={'Prototype': 'SenderProto'})

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it would be better to spell out everything fully (SenderPrototype and ReceiverProtoype on line 127 below)?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Full spellings are (I think) fully consistent throughout the cymetric metrics.

"""

if len(nuc_list) != 0:
nuc_list = format_nuclist(nuc_list)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea I think if it is called only for the base functions that these are dependent on, you should be good.

nuc_list : list of nuclide to select.
"""

if len(nuc_list) != 0:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that nuc_list is a filtering mechanism....but there is a warning in the ger_reduced_df function about having any of the filtering mechanisms empty: https://github.com/bam241/cymetric/pull/3/files#diff-908097c62f1520cddbe1acee59c44083R74

I think instead of that there should just be no filtering and all nucs/facilities/etc should be returned?

Copy link

@pshriwise pshriwise left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks nice! I think the number of lines here could be greatly reduced with some refactoring of certain function patterns. Maybe there is a good reason for this though, I don't know.

The testing looks very nice. I didn't see any examples in which you pass data structures that are expected to cause failures. This is something I've found is often overlooked in testing code.


def merge_n_drop(df, base_col, add_df, add_col):
"""
Merge some additionnal columns fram an additionnal Pandas Data Frame

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*additional

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

onother -> with another

df : Pandas Data Frame
base_col : list of the base columns names
add_df : Pandas Data Frame to add in the df one
add_col : columns tobe added

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tobe -> to be

add_col : columns tobe added
"""
df = pd.merge(add_df[add_col], df, on=base_col)
df.drop(base_col[1], 1)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

base_col is specified as a list. Why only drop the first entry in that list?


Parameters
----------
evaler_ : evaler

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This parameter could use a little more description. It seems to be the data frame to be filtered.

return df


def get_transaction_decayheat_df(evaler_, send_list=[], rec_list=[], commod_list=[], nuc_list=[]):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These get_transaction_*_df functions could probably be refactored or made an additional option to get_transaction_df.

return df


def get_transaction_decayheat_timeserie(evaler_, send_list=[], rec_list=[], commod_list=[], nuc_list=[]):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A similar refactor to the get_transaction_*_df functions could happen here it seems.

return df


def get_inventory_timeserie(evaler_, fac_list=[], nuc_list=[]):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function name looks incomplete here.

assert_frame_equal(cal, refs)


@dbtest

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious about the purpose of the function decorators here. Maybe you can elaborate on them during the code review meeting?

assert_frame_equal(cal, refs)


if __name__ == "__main__":

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this. Very convenient.

abachma2 pushed a commit that referenced this pull request Feb 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants