Provenance tracking #4
Comments
the initial email from Brian: ...
|
Pete's reply:
|
Pete also wrote about the typical git workflow for collaborators to contribute: In git, you would:
|
Brian's response to "from where did this code come"?
Also:
|
Pete commented on Brian's code example:
|
Brian wrote to Doga:
|
provenanceTracker.py
'''Code to record provenance information for a Python app

This assumes that all significant imports have been done before
routine provenanceTracker.provenanceTracker() is called.
'''
import sys
import platform

__version__ = '0.0.1'


def provenanceTracker():
    '''Provides a dict listing versions of imported packages

    :returns: dict where key is name of package and value is the
      __version__ string or for a few known outliers some other variable
      that indicates the version.
    '''
    PackageVersions = {}
    PackageVersions['Python'] = sys.version.split()[0]
    PackageVersions['Platform'] = (
        sys.platform + '|' + platform.architecture()[0] + '|' + platform.machine())
    # iterate over a copy, since attribute access can trigger further imports
    for name, pkg in list(sys.modules.items()):
        try:
            PackageVersions[name] = pkg.__version__
            continue
        except AttributeError:
            pass
        # deal with a few known idiosyncratic packages
        if name == 'Image':
            PackageVersions[name] = pkg.VERSION
        elif name == 'PIL':
            PackageVersions[name] = pkg.PILLOW_VERSION
    return PackageVersions


# test this by calling it directly
if __name__ == '__main__':
    import matplotlib as mpl
    import PIL
    import numpy
    try:
        import Image  # legacy standalone PIL module; may not be installed
    except ImportError:
        pass
    provenance = provenanceTracker()
    for p in sorted(provenance):
        print(p, provenance[p])
|
Brian wrote back:
|
Pete responds to Brian: It all depends on how it is intended to be used. |
Doga Gursoy (welcome to the discussion) wrote: For prototyping this is also an easy way: https://cookiecutter.readthedocs.org Specifically: https://github.com/audreyr/cookiecutter-pypackage |
Now that the discussion is up to date, I'll continue... What is the intended purpose of the addition of provenance code in this prototype?
Will this method be called more than once each time the python package is used? |
CookieCutter looks like a very useful tool, though different, I believe, from Brian's idea of provenance tracking. Think of that old lesson: do the fishing or teach the fishing. The PyPrototype project demonstrates the layout of a prototypical Python project, so it is on the teaching end. There will be some find/replace work to change each new copy of the prototype into a useful new project. Maybe that's too much work. CookieCutter is very much on the end where the fishing is actually done: one uses it to create a new skeleton Python project with all the right names and such (or some other metaphorical cookie shape) according to a customized template. Consider this: the PyPrototype project shows the pattern of the end result. |
I see this as allowing people to recreate the code environment that gave a particular result. I am assuming that, thanks to versioneer, one knows which version of the current code one is running (perhaps that should be integrated into provenanceTracker.py). But if a result is changed by, for example, a change in numpy, how does one track or recreate that? I envision calling the one function in provenanceTracker.py before saving output. By including the returned dictionary with the output, one would document as much as possible of the software stack. |
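A minimal sketch of the workflow described above, assuming results are saved as JSON. The names `collect_versions` and `save_with_provenance` and the `provenance` key are hypothetical, introduced only for illustration; `collect_versions` is a simplified stand-in for `provenanceTracker.provenanceTracker()`, not Brian's code.

```python
import json
import os
import platform
import sys
import tempfile


def collect_versions():
    """Simplified, hypothetical stand-in for provenanceTracker.provenanceTracker()."""
    versions = {
        'Python': sys.version.split()[0],
        'Platform': '|'.join(
            [sys.platform, platform.architecture()[0], platform.machine()]),
    }
    # Iterate over a copy: attribute lookups can trigger imports that
    # modify sys.modules during the loop.
    for name, module in list(sys.modules.items()):
        version = getattr(module, '__version__', None)
        if isinstance(version, str):
            versions[name] = version
    return versions


def save_with_provenance(result, path):
    """Write the result together with the software stack that produced it."""
    record = {'result': result, 'provenance': collect_versions()}
    with open(path, 'w') as handle:
        json.dump(record, handle, indent=2, sort_keys=True)
    return record


if __name__ == '__main__':
    out_path = os.path.join(tempfile.mkdtemp(), 'result.json')
    saved = save_with_provenance({'mean': 1.5}, out_path)
    print(saved['provenance']['Python'])
```

Storing the dictionary in the same file as the result keeps the two from drifting apart; whether JSON is the right container depends on the output format the application already uses.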
Can we adopt PEP8 standards for the new projects? https://www.python.org/dev/peps/pep-0008/ Two things I have noticed: |
Some pep8 standards, but not all of them. For example, errors on "E221 multiple spaces before operator" are just goofy. Sometimes we humans want to line up the equal signs in a block of assignments (such as in __init__.py). Trailing whitespace on a line (W291) is benign. Mostly, pep8 is advisory and should be taken with a healthy skepticism. |
I hate W291. Is there a way to configure automatic checking of some PEP8 standards? |
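One possible way to do this, sketched under the assumption that the checks are run with pycodestyle (the current name of the pep8 tool): it reads a `[pycodestyle]` section from a `setup.cfg` or `tox.ini` file in the project directory, while older pep8 releases read a `[pep8]` section instead. The choice of codes here is only an illustration taken from the two checks mentioned in this thread.

```ini
[pycodestyle]
# codes listed here are not reported; E221 and W291 are the two discussed above
ignore = E221,W291
```

Anyone who would rather keep W291 enforced can simply drop it from the list.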
|
much more valuable feedback and coaching from this tool than pep8 |
https://codeclimate.com is also useful and does this and some other error and readability checks for you. |
@nicholas-aps does any project you've been involved with use provenance tracking for data processing? Any ideas or suggestions? |
One project (of which I am aware) actively tracks provenance:
OK, I take that back. The Irena IgorPro macros maintained by Jan
Here, the provenance is recorded as data values in "wavenotes" (metadata
Irena:
Otherwise, it has been discussed in two data standards projects:
The most progress of these two was to assert the desire and importance
:NXprocess:
:NXnote:
Pete
On 3/31/2016 11:33 AM, Doga Gursoy wrote:
|
Brian Toby started an email discussion on Provenance tracking in Python. He included code that would be an enhancement for this prototype project.
To contribute that code, make a pull request.
Here follows the discussion...