-
Notifications
You must be signed in to change notification settings - Fork 112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance, Design and other fixes #100
Conversation
Codecov Report
@@ Coverage Diff @@
## master #100 +/- ##
==========================================
+ Coverage 97.35% 97.47% +0.12%
==========================================
Files 56 56
Lines 5102 5119 +17
==========================================
+ Hits 4967 4990 +23
+ Misses 135 129 -6
Continue to review full report at Codecov.
|
We talked about this offline, but just to document it: We also need to make sure custom properties' ordering works if they are passed to the constructor via a |
Ignore this, I was looking at master and didn't see how this PR changed the behavior. |
stix2/base.py
Outdated
# The custom properties will get added to the bottom. | ||
# Matched with a CustomProperty. | ||
extra_kwargs = list(set(kwargs) - set(cls._properties)) | ||
self._properties.update([(x, CustomProperty()) for x in extra_kwargs]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we want to do this. It updates the class instance of _properties
, meaning that any custom properties on the first instantiation of an SDO are allowed on any future instance, even if allow_custom is false for the future ones.
It's perhaps likely that people want to use the same custom properties on every instance of the same SDO that they create, but I'm not sure how I feel about the way this is done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a good point! So, could we (possible approaches):
- Make _properties an instance variable?
- Handle custom in a completely different structure?
- Go back with the old method (required checking the _inner dict)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll go back to the #85 fix. That was working fine.
I'd like to figure out better ways to track the version history of the Jupyter notebooks. I noticed that a couple switched from Python 2 to Python 3 (which is OK, but I just want to make sure what we do is deliberate). @clenk , @mbastian1135 any ideas? |
stix2/sources/__init__.py
Outdated
self.id = make_id() | ||
self.source = source | ||
self.sink = sink | ||
|
||
def get(self, stix_id, allow_custom=False): | ||
@abstractmethod | ||
def get(self, stix_id): # pragma: no cover |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than making these abstract methods, can we just define them similar to get(self, *args, **kwargs)
, and pass them on the source? I'd rather not require subclasses to override these.
@gtback The notebooks will always switch Python version when someone re-executes cells in them unless we all use the exact same version of Python. I think we could use a git filter to ignore the lines about python version. See https://stackoverflow.com/q/16244969/ We could have git ignore the If you're referring to tracking changes in the content of output cells instead, there are some ways to handle that too: http://nbsphinx.readthedocs.io/en/0.2.16/usage.html#Using-Notebooks-with-Git. |
@clenk: I just want to make sure we don't accidentally introduce extraneous diffs in the Notebook JSON files (output, Python version, execution count, etc.). That link to the nbsphinx output is extremely useful, but do we really want to not pre-run any of the cells? Will that even work on ReadTheDocs? (sorry, more questions than answers, I know) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There just seem to be a lot of intrusive changes to the data sources; to me, it's a code smell that we need to refactor this all a bit. I wish there were an easier way to handle these functions. Let's talk about this a bit more.
stix2/sources/__init__.py
Outdated
|
||
|
||
class DataSink(object): | ||
class DataSink(with_metaclass(ABCMeta)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having DataSink and DataSource as abstract classes is OK. I just think DataStore can provide a lot of default functionality (delegating to source
and sink
attributes.
stix2/sources/filesystem.py
Outdated
@@ -35,6 +35,85 @@ def __init__(self, stix_dir, bundlify=False): | |||
self.source = FileSystemSource(stix_dir=stix_dir) | |||
self.sink = FileSystemSink(stix_dir=stix_dir, bundlify=bundlify) | |||
|
|||
def get(self, stix_id, allow_custom=False, version=None, _composite_filters=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of this is specific to the FileSystem, I don't think. I'm hoping we can suitably abstract it with *args
and **kwargs
in the base class.
As the Jupyter notebook versioning discussion seems tangential to this PR, I've moved it to #101. |
…ts. Remove CustomProperty since it is no longer needed
Hopefully I have addressed the concerns with the DataStores. It looks a lot cleaner now. Note: I did not change anything related to Jupyter Notebook since there is a new issue open. If nothing else needs to be fixed i'd say its ready to be merged! 👍 |
The major changes include:
serialize()
method to add flexibility for performance. You can pass the same arguments you would for ajson.dumps()
call. For example,sort_keys=True
,indent=4
, etc.CustomProperty
class, which developers can extend from to create their custom properties. It is wrapper, but it helps to signal the usage of custom properties. This fixes the problem when an object is parsed withallow_custom=True
but the _properties dictionary would not get updated.DataStore
,DataSink
andDataSource
. I think the changes made are good for proper concrete implementations of these ABCs. For example, DataStore was never abstract...closes #99
closes #98
closes #96