Performance, Design and other fixes #100

emmanvg · 2017-11-03T12:51:51Z

The major changes include:

A new serialize() method to add flexibility for performance. You can pass the same arguments you would for a json.dumps() call. For example, sort_keys=True, indent=4, etc.
Created a CustomProperty class, which developers can extend from to create their custom properties. It is wrapper, but it helps to signal the usage of custom properties. This fixes the problem when an object is parsed with allow_custom=True but the _properties dictionary would not get updated.
ABC for DataStore, DataSink and DataSource. I think the changes made are good for proper concrete implementations of these ABCs. For example, DataStore was never abstract...

closes #99
closes #98
closes #96

… objects

…w_custom=True

codecov-io · 2017-11-03T12:59:22Z

Codecov Report

Merging #100 into master will increase coverage by 0.12%.
The diff coverage is 94.52%.

@@            Coverage Diff             @@
##           master     #100      +/-   ##
==========================================
+ Coverage   97.35%   97.47%   +0.12%     
==========================================
  Files          56       56              
  Lines        5102     5119      +17     
==========================================
+ Hits         4967     4990      +23     
+ Misses        135      129       -6

Impacted Files	Coverage Δ
stix2/test/test_bundle.py	`100% <100%> (ø)`	⬆️
stix2/test/test_memory.py	`100% <100%> (ø)`	⬆️
stix2/sources/memory.py	`100% <100%> (ø)`	⬆️
stix2/sources/__init__.py	`100% <100%> (+3.77%)`	⬆️
stix2/test/test_identity.py	`100% <100%> (ø)`	⬆️
stix2/base.py	`100% <100%> (ø)`	⬆️
stix2/test/test_data_sources.py	`100% <100%> (ø)`	⬆️
stix2/test/test_filesystem.py	`100% <100%> (ø)`	⬆️
stix2/sources/filesystem.py	`98.16% <100%> (-0.04%)`	⬇️
stix2/sources/taxii.py	`34.17% <33.33%> (+0.84%)`	⬆️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5aa8c66...258ea4b. Read the comment docs.

clenk · 2017-11-03T13:20:54Z

We talked about this offline, but just to document it: We also need to make sure custom properties' ordering works if they are passed to the constructor via a custom_properties dictionary instead of allow_custom. Like the 2nd example on https://stix2.readthedocs.io/en/latest/guide/custom.html#Custom-Properties.

gtback · 2017-11-03T15:13:22Z

@clenk: Is there a reason you think they would be handled differently? Everything gets collapsed into the same dictionary, whether they came from custom_properties dict or a standard kwarg (when allow_custom=True). It's interesting (and probably not desirable) that the custom properties come first in both examples on the Jupyter notebook you linked to.

Ignore this, I was looking at master and didn't see how this PR changed the behavior.

gtback · 2017-11-03T15:17:14Z

stix2/base.py

+            # The custom properties will get added to the bottom.
+            # Matched with a CustomProperty.
+            extra_kwargs = list(set(kwargs) - set(cls._properties))
+            self._properties.update([(x, CustomProperty()) for x in extra_kwargs])


I'm not sure we want to do this. It updates the class instance of _properties, meaning that any custom properties on the first instantiation of an SDO are allowed on any future instance, even if allow_custom is false for the future ones.

It's perhaps likely that people want to use the same custom properties on every instance of the same SDO that they create, but I'm not sure how I feel about the way this is done.

That is a good point! So, could we (possible approaches):

Make _properties an instance variable?

Handle custom in a completely different structure?

Go back with the old method (required checking the _inner dict)?

I'll go back to the #85 fix. That was working fine.

…ngly

gtback · 2017-11-07T19:26:33Z

I'd like to figure out better ways to track the version history of the Jupyter notebooks. I noticed that a couple switched from Python 2 to Python 3 (which is OK, but I just want to make sure what we do is deliberate). @clenk , @mbastian1135 any ideas?

gtback · 2017-11-07T19:30:58Z

stix2/sources/__init__.py

        self.id = make_id()
        self.source = source
        self.sink = sink

-    def get(self, stix_id, allow_custom=False):
+    @abstractmethod
+    def get(self, stix_id):  # pragma: no cover


Rather than making these abstract methods, can we just define them similar to get(self, *args, **kwargs), and pass them on the source? I'd rather not require subclasses to override these.

clenk · 2017-11-07T20:57:12Z

@gtback The notebooks will always switch Python version when someone re-executes cells in them unless we all use the exact same version of Python. I think we could use a git filter to ignore the lines about python version. See https://stackoverflow.com/q/16244969/ We could have git ignore the execution_count lines as well.

If you're referring to tracking changes in the content of output cells instead, there are some ways to handle that too: http://nbsphinx.readthedocs.io/en/0.2.16/usage.html#Using-Notebooks-with-Git.

gtback · 2017-11-07T22:07:48Z

@clenk: I just want to make sure we don't accidentally introduce extraneous diffs in the Notebook JSON files (output, Python version, execution count, etc.). That link to the nbsphinx output is extremely useful, but do we really want to not pre-run any of the cells? Will that even work on ReadTheDocs? (sorry, more questions than answers, I know)

gtback

There just seem to be a lot of intrusive changes to the data sources; to me, it's a code smell that we need to refactor this all a bit. I wish there were an easier way to handle these functions. Let's talk about this a bit more.

gtback · 2017-11-07T22:09:28Z

stix2/sources/__init__.py



-class DataSink(object):
+class DataSink(with_metaclass(ABCMeta)):


Having DataSink and DataSource as abstract classes is OK. I just think DataStore can provide a lot of default functionality (delegating to source and sink attributes.

gtback · 2017-11-07T22:12:07Z

stix2/sources/filesystem.py

@@ -35,6 +35,85 @@ def __init__(self, stix_dir, bundlify=False):
        self.source = FileSystemSource(stix_dir=stix_dir)
        self.sink = FileSystemSink(stix_dir=stix_dir, bundlify=bundlify)

+    def get(self, stix_id, allow_custom=False, version=None, _composite_filters=None):


None of this is specific to the FileSystem, I don't think. I'm hoping we can suitably abstract it with *args and **kwargs in the base class.

clenk · 2017-11-08T14:13:42Z

As the Jupyter notebook versioning discussion seems tangential to this PR, I've moved it to #101.

…issue-98

…ts. Remove CustomProperty since it is no longer needed

emmanvg · 2017-11-08T18:56:30Z

Hopefully I have addressed the concerns with the DataStores. It looks a lot cleaner now. Note: I did not change anything related to Jupyter Notebook since there is a new issue open. If nothing else needs to be fixed i'd say its ready to be merged! 👍

emmanvg added 5 commits November 2, 2017 21:29

ABC for DataSink, DataStore and DataSource. Fixes across the concrete…

1e591a8

… objects

Update tests related to Datastores

cd6c798

Define CustomProperty. Make sure to update _properties dict when allo…

8c2af81

…w_custom=True

Extend object serialization options for _STIXBase

d6c1413

Update TAXII datastores

d31b110

gtback reviewed Nov 3, 2017

View reviewed changes

emmanvg added 3 commits November 3, 2017 14:17

Revert object_properties() to oasis-open#85 fix. Update tests accordi…

7947558

…ngly

Update Jupyter Notebook documentation

6f1ab52

Test other CompositeDataSource operations

0d5859b

emmanvg force-pushed the issue-98 branch from 1efeee8 to 0d5859b Compare November 3, 2017 19:13

Remove stray print()

19818c8

gtback reviewed Nov 7, 2017

View reviewed changes

gtback suggested changes Nov 7, 2017

View reviewed changes

clenk mentioned this pull request Nov 8, 2017

Investigate Jupyter notebook version history #101

Closed

emmanvg added 2 commits November 8, 2017 10:31

Merge branch 'master' of github.com:oasis-open/cti-python-stix2 into …

489e45a

…issue-98

Make DataStore a regular class, remove unwanted overrides, update tes…

da66f10

…ts. Remove CustomProperty since it is no longer needed

Some cleanup in MemoryStore

258ea4b

gtback approved these changes Nov 8, 2017

View reviewed changes

gtback merged commit ef6dade into oasis-open:master Nov 8, 2017

gtback added this to the 0.4 milestone Jun 20, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance, Design and other fixes #100

Performance, Design and other fixes #100

emmanvg commented Nov 3, 2017

codecov-io commented Nov 3, 2017 •

edited

clenk commented Nov 3, 2017

gtback commented Nov 3, 2017 •

edited

gtback Nov 3, 2017

emmanvg Nov 3, 2017

emmanvg Nov 3, 2017

gtback commented Nov 7, 2017

gtback Nov 7, 2017

clenk commented Nov 7, 2017

gtback commented Nov 7, 2017

gtback left a comment

gtback Nov 7, 2017

gtback Nov 7, 2017

clenk commented Nov 8, 2017

emmanvg commented Nov 8, 2017



		class DataSink(object):
		class DataSink(with_metaclass(ABCMeta)):

Performance, Design and other fixes #100

Performance, Design and other fixes #100

Conversation

emmanvg commented Nov 3, 2017

codecov-io commented Nov 3, 2017 • edited

Codecov Report

clenk commented Nov 3, 2017

gtback commented Nov 3, 2017 • edited

gtback Nov 3, 2017

Choose a reason for hiding this comment

emmanvg Nov 3, 2017

Choose a reason for hiding this comment

emmanvg Nov 3, 2017

Choose a reason for hiding this comment

gtback commented Nov 7, 2017

gtback Nov 7, 2017

Choose a reason for hiding this comment

clenk commented Nov 7, 2017

gtback commented Nov 7, 2017

gtback left a comment

Choose a reason for hiding this comment

gtback Nov 7, 2017

Choose a reason for hiding this comment

gtback Nov 7, 2017

Choose a reason for hiding this comment

clenk commented Nov 8, 2017

emmanvg commented Nov 8, 2017

codecov-io commented Nov 3, 2017 •

edited

gtback commented Nov 3, 2017 •

edited