Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Spring cleaning (Treant unification, no uuids, .datreant state) #145

Merged
merged 43 commits into from
Nov 14, 2017

Conversation

dotsdl
Copy link
Member

@dotsdl dotsdl commented Apr 24, 2017

fixes #100
fixes #101
fixes #135
fixes #65

A couple weeks ago @kain88-de and @richardjgowers and I met to discuss some changes I was considering for datreant. These changes include removal of uuids (#135), removal of the concept of different Treant types from the whole library (goes toward #100), and the replacement of a bare state file with a .datreant directory indicating a Treant.

I'd like to store tags and categories in separate files under this scheme, which will require more hacking. Basically, instead of a limb such as Tags storing all its stuff with any other limb's stuff in the same file, it can do what it needs separately. Not sure if this idea will be something we like but toying with it still.

Once the dust settles I'll also add a converter script for putting an existing Treant into the new form.

dotsdl added 2 commits May 5, 2017 13:36
There is no state file for the overall Treant anymore. Now, what makes a
Treant is the presence of the `.datreant` directory alone. Limbs make
whatever files they need inside there to function.
All tests pass again, so we shouldn't have any obvious regressions.
Now need to add tests that Treants are as hardy as they should be with
new machinery.

@property
def _treantfile(self):
return os.path.join(self.abspath, treantdir_name, treantfile_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return os.path.join(self._treantdir, treantfile_name)


"""
return self._backend.filename
return Path(self._backend.get_location()).absolute().parent
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not Path(self._treantdir).absolute().parent?

Copy link
Contributor

@kain88-de kain88-de left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good so far

@@ -0,0 +1,2 @@
treantdir_name = '.datreant'
treantfile_name = 'state.json'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't globals we written in all caps

These aren't tested yet, so much to do still.
@dotsdl
Copy link
Member Author

dotsdl commented May 28, 2017

Status update: this will be a big PR, essentially being a refresh of much of the core library. I went through the list of open issues last night, and the changes we're making here will make many of our older issues either solved or irrelevant. This is similar in effect (though not in scale) to the MDAnalysis topology system refactor, and I'm shooting for consistency throughout the library.

Hang tight; more to come. :D

@codecov-io
Copy link

codecov-io commented May 28, 2017

Codecov Report

Merging #145 into develop will increase coverage by 1.93%.
The diff coverage is 80.29%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #145      +/-   ##
===========================================
+ Coverage     77.7%   79.64%   +1.93%     
===========================================
  Files           14       12       -2     
  Lines         1624     1243     -381     
  Branches       354      283      -71     
===========================================
- Hits          1262      990     -272     
+ Misses         306      206     -100     
+ Partials        56       47       -9
Impacted Files Coverage Δ
src/datreant/core/__init__.py 100% <ø> (ø) ⬆️
src/datreant/core/util.py 100% <100%> (ø) ⬆️
src/datreant/core/exceptions.py 100% <100%> (ø)
src/datreant/core/names.py 100% <100%> (ø)
src/datreant/core/manipulators.py 87.5% <100%> (-1.08%) ⬇️
src/datreant/core/state.py 71.81% <100%> (ø)
src/datreant/core/collections.py 67.68% <61.41%> (-0.29%) ⬇️
src/datreant/core/trees.py 84.31% <66.66%> (+0.38%) ⬆️
src/datreant/core/treants.py 74.11% <71.42%> (-9.56%) ⬇️
src/datreant/core/metadata.py 90.36% <94.08%> (ø)
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dca253e...5a2d993. Read the comment docs.

@kain88-de
Copy link
Contributor

@dotsdl any progress on this?

@dotsdl
Copy link
Member Author

dotsdl commented Jun 24, 2017

@kain88-de been working on-and-off on it. Have a new application to build for work that I think will be MDSynthesis-like, in that it uses Treant as a base-class for a domain-specific object. We use datreant pretty extensively for getting things done there, but I don't have a lot of spare cycles at the moment to work directly on it.

I think when @richardjgowers is here for a week I'll find some motivation to just finish it. Standby. :D

This will allow to drop the wrapper once the library is py3 only.
@kain88-de
Copy link
Contributor

kain88-de commented Jul 21, 2017

funny the code now works on python 3.6 and only that version

@kain88-de
Copy link
Contributor

The error seems to be with the multiprocessing module. I don't have any experience with it myself. @dotsdl would be nice if you could have a lock

@kain88-de
Copy link
Contributor

OK I fixed the python 2.7 errors I found as well. But really no clue about the multiprocessing errors. It looks like the processes never finish

setup.py Outdated
@@ -32,5 +32,5 @@
license='BSD',
long_description=open('README.rst').read(),
tests_require = ['numpy', 'pytest>=2.10', 'mock'],
install_requires=['asciitree', 'pathlib2', 'scandir', 'six', 'fuzzywuzzy']
install_requires=['asciitree', 'pathlib2', 'scandir', 'six', 'fuzzywuzzy', 'six']

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'six' appears twice.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks I overlooked that

Tree.trees, Tree.leaves, Tree.children were once properties, but since
these things really could benefit by optionally returning a View with
hidden files/directories, we made them methods instead. This also gives
some indication that work has to be done to call these, in that it
involves a filesystem call.

We have gotten rid of .hidden. Also, .trees, .leaves, and .children have
been added to Bundle since these are exposed to Treants.
Small changes to make this clearer, even if not all limbs are suitable
for Tree objects.
@dotsdl
Copy link
Member Author

dotsdl commented Jul 31, 2017

@orbeckst don't worry...got Python 2.7 working again. Was a small issue with how pickling was being handled in 2 vs. 3. We won't be dropping support at this time.

@kain88-de
Copy link
Contributor

@dotsdl and @richardjgowers do you have a list of what is still left todo?

from .trees import Tree, Leaf
from .names import treantdir_name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still for using all caps on constants


"""
_classagglimbs = set()
_agglimbs = set()

def __init__(self, *vegs, **kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kwargs are unused now. Still needed ?

def __init__(self, collection):
self._collection = collection
@staticmethod
def _init_state(jsonfile):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this function really needed? I don't see the benefit of it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also confused here why this function is static. It is only ever called with self as argument.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is the init_state method specific to Tags, which gets fed to the JSONFile constructor for use when creating a file from scratch. For Tags this generates an empty list. Categories has its own that generates an empty dict.

super(JSONFile, self).__init__(filename, **kwargs)

# set _init_state function that given
self._init_state = init_state
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the object self._init_state is nowhere really called.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self._init_state gets called by Tags and Categories classes, which use JSONFile to talk to their state files. Basically, it's defined at runtime on creation of the JSONFile object.

I'll admit it's a confusing way to go about it. I'm open to alternatives.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, but where is it called? It should also be commented what the intent of this. Otherwise, I'm gonna ask again in the future.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a brief docstring to each with this information. Is there more needed, though?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope clearer now. A comment here may be nice. But I don't remember exactly why I didn't understand this. Reading the code now it's clear to me.

"""Return a set giving the Tags in `a` that are not in `b`.

"""
from .metadata import AggTags
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is cool. You are saying import class from this file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha...that's an artifact from before when these classes where in a separate file. I can probably get rid of it.

Added thee upon access of tags or categories since these throw
exceptions already upon access. We don't want to do a check for
`.datreant` on every access of Treant since this will have a potentially
high performance hit for many Treants. Basically, we only raise the
issue when it really matters.
@dotsdl
Copy link
Member Author

dotsdl commented Nov 14, 2017

Finished updating docs. Next up is to write a simple converter script for old Treants to new form.

@kain88-de
Copy link
Contributor

Finished updating docs. Next up is to write a simple converter script for old Treants to new form.

@mimischi there is gonna be a script, I hope I can adjust it to work on a Sim as well.

docs/treants.rst Outdated
This means that changing the location or name of a Treant can be done at the
filesystem level. Although this means that one can change the treanttype and
uuid as well, this is generally not recommended.
place on disk; they store almost nothing in memory.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also point out that we only keep the filehandle alive as shortly as possible. It's good on rare occasions to have this documented.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Added.

@dotsdl
Copy link
Member Author

dotsdl commented Nov 14, 2017

If we're happy with this, I'm good with merging (finally). Next up will be getting the other PRs that have been waiting in the wings finally in as well.

@kain88-de
Copy link
Contributor

sure merge ahead

@dotsdl dotsdl merged commit 830bef2 into develop Nov 14, 2017
@kain88-de kain88-de deleted the spring_cleaning branch November 21, 2017 22:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants