
Code cleanup and streamlining #6

Merged
merged 14 commits into emdb-empiar:dev on Jan 28, 2022

Conversation

hernot (Contributor) commented Jul 25, 2021

I could not help doing some streamlining, code clean-up and documentation updates, hopefully bringing the whole closer to an actual release.

Details about the tiny bits and polishing can be found in the commit messages.

I have tested/tox'ed the whole for Python 2.7, 3.6, 3.7 and 3.8. For 3.5 I do not have any installation any more, and for 3.9 not yet.

As for the extra.py module: if it is still an integral part of ahds, you will still have to write its test suite; it is the only one missing for reaching 100% code coverage.

* Cleanup of code that was commented out but not yet removed.
* Using the else clause of while and for loops where sensible (see the
  sketch right below) instead of patterns like:

  ```
  a = None
  while b:
      if c:
          a = c
          break
  if a is None:
      a = ''
  ```
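
  A sketch of the equivalent else clause form (with illustrative data in
  place of the schematic b and c of the snippet above):

  ```python
  queue = [0, 0, 3]          # illustrative data
  while queue:
      c = queue.pop(0)
      if c:                  # found the item we were looking for
          a = c
          break
  else:                      # runs only when the loop ended without break
      a = ''
  ```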

* remodeled the tests for data_stream.py, grammar.py and header.py to first
  test individual classes and their methods before testing complex
  usage scenarios. This simplifies the setUpClass method and allows errors
  to be identified more up front, without stepping through cascades of
  function calls. Added the tests/test_03_data_streams_common.py test suite,
  which tests functions and classes imported and required by header.py up front.

* reduced the methods exported by grammar.py to get_header only
* added custom parser classes derived from simpleparse.parser for
  AmiraMesh and HyperSurface files. Both share the same constructor
  interface, accepting additional keyword arguments. The latter are
  passed by the parse method of the HyperSurfaceParser class to the
  parse_hypersurface_data method to parse the content structure of
  HyperSurface files after parsing the header. This ensures that both
  parsers return a complete structure of the parsed files.

* updated the documentation and readme as far as possible.
  docs/ahds_classes.png still needs to be updated to reflect the new class
  structure. A check whether further documentation updates are appropriate
  will be necessary.
- github workflow now uses the tox-gh-actions plugin for tox (see the
  tox.ini sketch after this list):
  * all testing, including the syntax check, is now run from within a tox
    environment. This ensures that most errors can be identified through a
    local tox run and fixed before committing and pushing to github, for
    all locally installed python interpreter versions. Furthermore, most
    errors occurring when the github workflow runs can be reproduced through
    a local tox run. The only requirement for this is to ensure that the
    affected python version is installed locally.

- github workflow extended by Windows and macOS builds (experimental)

- some minor fixes for future breaking points identified through a
  local flake8 run.

TODO: we need to discuss which of the issues reported by flake8 should
      be addressed before release.
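
A minimal sketch of the kind of tox-gh-actions mapping mentioned in the
workflow bullet above (environment names are illustrative, not necessarily
the project's actual tox.ini):

```ini
[tox]
envlist = py27, py36, py37, py38

[gh-actions]
# maps the Python version provided by actions/setup-python to a tox env
python =
    2.7: py27
    3.6: py36
    3.7: py37
    3.8: py38
```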
hernot (Contributor, Author) commented Aug 16, 2021

People taking an apple, taking a bite, putting it on the table and calling it a computer are suspicious.
People staring inside through any window, complaining about everything and thereby pretending to work on the computer are not to be trusted.
Still, ahds now builds on Python >= 3 for both Windows and macOS, and on Python 2.7 for macOS.
I had to modify the GitHub Actions workflow a bit to ensure that all tests and the upload to Coveralls work on all platforms. The GitHub action selected originally only worked on Linux, and the official Coveralls GitHub action insists on loading lcov files, which are not generated by the pytest-cov module, only by coverage directly. Luckily, coveralls.py directly supports GitHub and can directly parse the .coverage database file. All the details are in the commit message.
Furthermore, I changed the workflow so that all testing, including the style check through flake8, is run through tox. That allows most of the testing for different Python versions to be done locally, before the first push.


--------------------------------------------
Installation
--------------------------------------------
Presently, ``ahds`` only works with Python 2.7 but will soon work on Python 3. Please begin by

Member: Is it possible to install without numpy? Please try in an empty virtual env.


pip install ahds

----------------------------------------------
License

Member: The license and copyright need to be retained. Could you please add these back?

@@ -112,7 +293,7 @@ The `designation` is the first line and conveys several important details about

* extra format data e.g. ``<hxsurface>`` specifying that an AmiraMesh file will contain HyperSurface data

A series of `definitions` follow that refer to data found in the data pointer sections that either begin with the word ‘define’ or have ‘n’ prepended to a variable. For example:
A series of `declarations` defines the structure and layout of the data and how many elements the corresponding array, vector or matrix contains. A declaration line starts with the word ‘define’ or, in files created by early Amira versions, has ‘n’ prepended. For example:

Member: Is declarations the official term?

hernot (Contributor, Author): Good question; I need to check whether they have a term for it in their reference guide. Otherwise I would believe we are free to find an appropriate name for the part defining the array/matrix size and the part describing each data item therein, and thus the data-pointers, in detail.

hernot (Contributor, Author): OK, I checked the reference guide for version 5; they talk about arrays and entries:

[...] one-dimensional array Nodes containing entries of type float[3] called
Coordinates, as well as a one-dimensional array Tetrahedra containing entries of
type int[4] called Nodes. [...]

and further down they describe the statement defining the array

[...] The statement define Nodes 4 defines a one-dimensional array of size 4.
Later on, this array can be referenced using the name Nodes. Similarly, a
statement define Array 100 100 defines a two-dimensional array of size 100 x
100 [...]

and the statement defining the data element / item

The statement Nodes { float[3] Coordinates } = @1 specifies that
for each element of the array Nodes defined earlier, three floating point numbers
(floats) should be stored. These data are given the name Coordinates and a tag
in this line and will appear below as a tagged block with the marker @1

So their wording uses array and element definition, and tag, tagged block and marker when they mean the @N symbol. I would therefore change the wording to array definitions and element definitions if we want to stick strictly to their wording. Should we also reword the corresponding grammar symbols and variable names to be consistent and to ease reading and understanding the code, for future contributors and for ourselves when we have to amend it in one of our next lives?

@@ -1,15 +1,32 @@
``ahds`` package
================

.. automodule:: ahds
.. automodule:: src.ahds

Member: I suppose this is for practical purposes, i.e. it only works when pointing to src.

hernot (Contributor, Author): Yes, it had something to do with testing, etc. I would have to check the commit message for the detailed reasons (I hope I put them there).

hernot (Contributor, Author) commented Sep 20, 2021:

OK, I have checked: this was necessary because in pull request #4, which you already merged with commit 964eff1, I had to change the file system structure to consistently be able to reproduce locally, using pytest and tox, a bug which I had encountered on the CI (back then still Travis). The only way I found back then to get hold of the bug without overstressing the CI was to change the filesystem structure to the one preferred by pytest.
This change now reflects that previous change in the doc sources, to ensure sphinx is able to collect the source file documentation from the new file locations.

Added tests
-----------
 * ahds_readonly_descriptor
 * ahds_parent_descriptor
 * ahds_member_descriptor
 * BlockMetaClass
 * Python 2 specific patches, e.g. dict

Added conftest.py to ensure the essential tests for
ahds_readonly_descriptor, ahds_parent_descriptor and BlockMetaClass can
be executed even if an error in BlockMetaClass or other essential parts
prevents the regular import of ahds/core.py.
For this it implements a pytest_sessionstart hook which tests whether importing
ahds/core.py succeeds. In case the import fails, a fallback module is loaded
and patched with all the components successfully created before the error
occurred. The resulting module is then placed in the sys.modules dictionary
instead of the failed module. This ensures that the test collection phase of
pytest succeeds even though the ahds/core module cannot be imported.
In case the import is not possible, the pytest_collection_finish hook adds the
pytest.mark.xfail mark to all tests unless a test has explicitly been
marked with the run_recover mark, which is defined by the
pytest_configure hook. Any test to which the run_recover mark is
attached will be run even if ahds/core.py cannot be imported and is
replaced by the patched fallback module. All other tests, which are
marked xfail, will not be run at all.
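
A minimal sketch of such a hook pair (illustrative only; it uses the more
common pytest_collection_modifyitems hook in place of the
pytest_collection_finish hook described above, and omits the patching of
partially created components):

```python
# conftest.py -- hypothetical sketch, not the actual ahds conftest
import sys
import types

import pytest

CORE_IMPORT_FAILED = False

def pytest_sessionstart(session):
    global CORE_IMPORT_FAILED
    try:
        import ahds.core  # noqa: F401
    except Exception:
        CORE_IMPORT_FAILED = True
        # install a fallback so collection can still import the name
        sys.modules["ahds.core"] = types.ModuleType("ahds.core")

def pytest_configure(config):
    config.addinivalue_line(
        "markers", "run_recover: run even if ahds.core failed to import"
    )

def pytest_collection_modifyitems(config, items):
    if not CORE_IMPORT_FAILED:
        return
    for item in items:
        if "run_recover" not in item.keywords:
            # run=False means the test is not executed at all
            item.add_marker(
                pytest.mark.xfail(run=False, reason="ahds.core failed to import")
            )
```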

Python 2 dict remap
-------------------
Removed the _dict_iter_items, _dict_iter_keys and _dict_iter_values
exports of the unbound dict.iteritems, dict.iterkeys and dict.itervalues
functions. Instead, the Python 2 specific section defines a dict subclass
of builtins.dict that remaps these methods to dict.items, dict.keys and
dict.values; the original dict.items, dict.keys and dict.values methods
are mapped to dict.listitems, dict.listkeys and dict.listvalues. The
original builtins.dict class is replaced by this Python 3 conformant
facade.
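
A hypothetical sketch of such a facade (illustrative; the actual remapping
in ahds/core.py may differ in detail):

```python
# Python 2 only -- hypothetical sketch of the dict facade described above
import sys

if sys.version_info[0] == 2:
    import __builtin__ as builtins

    class dict(builtins.dict):
        # Python 3 style: items/keys/values return iterators instead of lists
        items = builtins.dict.iteritems
        keys = builtins.dict.iterkeys
        values = builtins.dict.itervalues
        # the original list-returning methods stay reachable under new names
        listitems = builtins.dict.items
        listkeys = builtins.dict.keys
        listvalues = builtins.dict.values

    builtins.dict = dict  # replace the builtin with the conformant facade
```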

TODO: it has to be tested whether any of the still used Python 2 based production
environments in which ahds is used requires that the dict.items, dict.keys and
dict.values methods return an indexable list instead of a DictItemsView,
DictKeysView or DictValuesView object. If this is the case, the Python 3
conformant remapping has to be removed again. Otherwise it will help to
motivate developers to stick to Python 3 instead of Python 2 for new
projects and programs.

Python 2 range remap
--------------------
Remapped builtins.range to builtins.xrange; the original range can be
accessed through builtins.slow_range.
TODO: if any productive software requires that range returns a list and
not an xrange object (both of which are indexable), this has to be undone
again.
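
Again a hypothetical sketch of the remap just described:

```python
# Python 2 only -- hypothetical sketch of the range remap described above
import sys

if sys.version_info[0] == 2:
    import __builtin__ as builtins

    builtins.slow_range = builtins.range  # keep the list-returning original
    builtins.range = builtins.xrange      # Python 3 like lazy range
```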

Some notes capturing ideas on how to write AmiraMesh files
----------------------------------------------------------
Added some notes on how writing out valid AmiraMesh files could be
accomplished. Just a first collection of thoughts and ideas for discussion.
hernot (Contributor, Author) commented Sep 20, 2021

OK, as Coveralls seems to have been fighting with their database since yesterday and still encounters problems, I will retry once again tomorrow, when uploading reports is flawlessly possible again, and then squash all the unsuccessful upload runs. @paulkorir, can you check in the GitHub Actions settings whether retriggering unsuccessful runs can be configured, and what possibilities the configuration allows to restrict this to you as maintainer and to the submitter of the pull request? Failed runs due to Coveralls downtime can happen anytime again.

hernot (Contributor, Author) commented Sep 21, 2021

OK, Coveralls is back up again, just right on the spot for pushing the update removing the numpy build dependency.
Furthermore, I have checked that the current pull request re-adds the license statement in README.rst and docs/overview.rst.

The last commit also includes some fixes reducing the number of code changes necessary on removal of Python 2. The tricky part about them is that they monkeypatch the builtins module to gain Python 3 compatible interface behavior. The downside is that these fixes only work safely as long as none of your still Python 2 based production environments importing the ahds library requires that dict.items(), dict.keys(), dict.values() and range return indexable list objects: the dict object is monkey patched with a facade class rewiring dict.items to dict.viewitems, dict.keys to dict.viewkeys and dict.values to dict.viewvalues, and the range function is monkey patched to point to the Python 3 range like xrange function. Let me know if any of these monkey patches could pose a problem on your systems not yet migrated to Python 3. Otherwise, as according to the roadmap dropping Python 2 support is not far ahead anyway, I would keep them.

I have also included in the weekend commit some thoughts on how writing AmiraMesh files could work. The idea is to have a starting point for a discussion and design phase.

EDIT
Just checked a consideration (idea) which came to my mind this morning. Counting the formatting support we already get from Python, I do think we have about 80% of what we need for writing; it just needs a refactoring of the _load_declarations and _load_definitions functions and an appropriate way to generate the Parameters section. With this also comes the question of what a simple interface should look like for creating a valid ahds object structure which can be converted into a valid AmiraMesh file and, in future, possibly also a valid HxSurface file. I will update the list of my thoughts with these considerations in the next days.

EDIT
Fixed some minor issue in the GitHub Actions workflow file causing Windows runners to upload an empty coverage report.

Building the decoders module no longer requires numpy. The
same holds for the hxbyterle_decoder fallback python function. Both now
return a bytearray object which can be passed to np.frombuffer for
converting it into a numpy array. Both functions, as well as the
raw decode function, now mimic the call signature of zlib.decompress,
which is used to decode HxZip encoded data.
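
A hypothetical usage sketch of the calling convention described above
(raw_decoder stands in for the actual decoder functions, whose names and
exact signatures may differ):

```python
import zlib

import numpy as np

def raw_decoder(data, size=None):
    """Stand-in decoder mimicking the zlib.decompress call signature."""
    return bytearray(data)

encoded = zlib.compress(b"\x00\x01\x02\x03")
hxzip_data = zlib.decompress(encoded)            # HxZip encoded streams
raw_data = raw_decoder(b"\x00\x01\x02\x03")      # fallback, returns bytearray
array = np.frombuffer(raw_data, dtype=np.uint8)  # numpy conversion is separate
```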

GitHub Actions update:
----------------------
On Windows runners, empty coverage reports were uploaded, leaving the
impression that no line was tested, even though pytest reported
successful completion and that all lines in all files had been tested,
except extra.py, for which no tests exist yet.

This issue is fixed by forcing GitHub Actions runners to use bash as the
shell in the preparatory coveralls upload steps:

  Coveralls python 2.7 preconditions
  Coveralls recent python preconditions

instead of the default shell, which would be PowerShell on Windows runners
and bash for all others.
According to the documentation, legacy PowerShell does not apply proper utf8
encoding, which is a precondition for successfully appending environment
variables to $GITHUB_ENV to be used in subsequent steps. Enforcing the use
of bash, according to the documentation, selects on Windows runners the bash
bundled with git for Windows instead of the default pwsh (PowerShell).
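
A minimal sketch of such a step (the step name is taken from above; the run
content and matrix variables are illustrative):

```yaml
- name: Coveralls recent python preconditions
  shell: bash        # force git-bash on Windows runners instead of pwsh
  run: |
    # values appended to $GITHUB_ENV are read back by subsequent steps
    echo "COVERALLS_FLAG_NAME=${{ matrix.os }}-${{ matrix.python-version }}" >> $GITHUB_ENV
```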
… name; fixed stream handle swallowing the final byte(s) of a stream when their value equals the b'\n' character; fixed a missing verbose flag check; tests still missing
hernot (Contributor, Author) commented Oct 20, 2021

While trying to use the forthcoming version in my productive work (testing with real-world data), I came across a rather strange HxSpreadSheet file with column names which do not include a column number counter and are all the same. That seemed, at least in the past, to be a valid AmiraMesh file. Therefore I'm currently refactoring my proposal on how specialised ContentTypes like HxSpreadSheet could be supported more reliably and hopefully then again written to file such that Amira is able to read them properly. I'm trying to come up with a simpler, better maintainable and thereby less complex solution. I already have an idea which I'm currently testing. It will need some days; I will let you know when finished.

While testing with old real data generated by Amira, a rather unusually
formatted header caused problems properly loading the data. The header
of the 'tests/data/EqualLabeledColumnsSpreadSheet.am' file defines
9 arrays with exactly the same name 'AT' and it also defines nine data
streams with exactly the same name 'AT'. Only from the parameter section
can it be deduced that each of the 9 arrays represents one '__Column',
as the '__ColumnNameXXXX' parameters still read '__ColumnName0000'
through to '__ColumnName0009'. With all 9 arrays and corresponding data
having the same name, only the last array was added to the AmiraHeader
structure. All others were dropped.

This is solved by refactoring the dispatch time filtering of
array_definitions for collecting the array definitions which could be
part of a dedicated special 'ContentType' like a 'HxSpreadSheet'. The
filtering is now only applied if the special 'ContentType' parameter
is encountered while parsing the Parameters section of the file.
The main purpose of the 'DispatchFilter' objects used for filtering is
twofold:
  1) Indicate whether the content of the file needs special handling
     during setup of the AmiraHeader file structure
  2) Filter and rewrite the names of the arrays and the corresponding
     data_names where necessary, and update the affected low level
     array_definition objects and data_definition objects accordingly

The benefit of this refactoring is that the dispatch time filtering
became more stable and the interface could be further simplified.
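
A hypothetical sketch of what such a DispatchFilter-style object could look
like (class name, method names and data layout are illustrative, not the
actual ahds interface):

```python
# hypothetical sketch -- not the actual ahds DispatchFilter interface
class SpreadSheetDispatchFilter:
    content_type = "HxSpreadSheet"

    def applies(self, parameters):
        # purpose 1: only engage when the special ContentType parameter
        # was encountered in the Parameters section
        return parameters.get("ContentType") == self.content_type

    def rewrite(self, array_definitions, parameters):
        # purpose 2: rename duplicate arrays (e.g. nine columns all named
        # 'AT') from the __ColumnNameXXXX parameters, so that every array
        # keeps a unique name on the AmiraHeader structure
        for index, definition in enumerate(array_definitions):
            new_name = parameters.get("__ColumnName%04d" % index)
            if new_name:
                definition["array_name"] = new_name
        return array_definitions
```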
hernot (Contributor, Author) commented Oct 21, 2021

OK, refactoring done. A nice effect: the interface for dispatch time prefiltering and, if necessary, recoding array_names, array_references and data_names became clearer, and the code to handle the included HxSpreadSheet example became even smaller and cleaner.
EDIT
After fixing the small bug in the regular expressions used to identify the start of HxSurface streams, everything works again. Along the way I added a static method called parametrize to the Py23FixTestcase class. Like pytest's own parametrize, it allows defining a test function which is called multiple times with different parameters. The reason for this home-brew version is that pytest.mark.parametrize is not supported by the unittest module, and the unittest module does not provide an equivalent. I didn't want to write the very same test code five times just to test the two stream header start regular expressions for "AmiraMesh" and "HxSurface" files with various 'ascii' and 'binary' files.
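
A minimal sketch of such a unittest-compatible parametrize helper
(illustrative; the actual Py23FixTestcase.parametrize may work differently,
and subTest requires Python 3):

```python
import unittest

def parametrize(*cases):
    """Decorator running the wrapped test once per parameter tuple."""
    def decorator(test_func):
        def wrapper(self):
            for case in cases:
                with self.subTest(case=case):
                    test_func(self, *case)
        return wrapper
    return decorator

class StreamHeaderRegexTests(unittest.TestCase):
    @parametrize(("AmiraMesh", True), ("HyperSurface", True), ("garbage", False))
    def test_recognises_stream_header(self, token, expected):
        self.assertEqual(token in ("AmiraMesh", "HyperSurface"), expected)

if __name__ == "__main__":
    unittest.main()
```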

…ing in case it cannot be excluded that files are opened for loading.
paulkorir merged commit ac19eac into emdb-empiar:dev on Jan 28, 2022