Merge branch 'master' of https://github.com/gb119/Stoner-PythonCode

Allow libmagic to be optional dependency
Bump version to 0.6.0b5
stonerlab · Feb 10, 2016 · 9ebfcfc
2 parents e507cfa + 66d0644
commit 9ebfcfc
Showing 8 changed files with 207 additions and 169 deletions.
diff --git a/README.md b/README.md
@@ -1,151 +1,155 @@
-Introduction
-============
-
-The *Stoner* Python package is a set of utility classes for writing data
-analysis code. It was written within the Condensed Matter Physics group
-at the University of Leeds as a shared resource for quickly writing
-simple programs to do things like fitting functions to data, extract
-curve parameters and churn through large numbers of small text data
-files.
-
-For a general introduction, users are referred to the Users Guide, which
-is part of the [online documentation](http://pythonhosted.org/Stoner/)
-along with the API Reference guide. The [github
-repository](http://www.github.com/gb119/Stoner-PythonCode/) also
-contains some example scripts.
-
-Getting this Code
-=================
-
-The *Stoner* package requires numpy \>=1.8, scipy \>=0.14, matplotlib
-\>=1.4, h5py, numba lmfit and blist. Experimental code also makes use of
-the Enthought Tools Suite packages.
-
-Ananconda Python (and probably other scientific Python distributions)
-include nearly all of the dependencies, aprt from lmfit. However, this
-can by installed with the usual tools such as *easy\_install* or *pip*.
-
-``` {.sourceCode .sh}
-easy_install lmfit
-```
-
-The easiest way to install the Stoner package is via seuptools'
-easy\_install
-
-``` {.sourceCode .sh}
-easy_install Stoner
-```
-
-This will install the Stoner package and any missing dependencies into
-your current Python environment. Since the package is under fairly
-constant updates, you might want to follow the development with git. The
-source code, along with example scripts and some sample data files can
-be obtained from the github repository:
-<https://github.com/gb119/Stoner-PythonCode>
-
-The codebase is largely compatible with Python 3.4, with the expception
-of the 3D vector map plots which make use of Enthought's *mayavi*
-package which is still only Python 2 compatible due to the underlying
-Vtk toolkit. Other issues of broken 3.4 code are bugs to be squashed.
-
-Overview
-========
-
-The **Stoner** package provides two basic top-level classes that
-describe an individual file of experimental data and a list (such as a
-directory tree on disc) of many experimental files. For our research, a
-typical single experimental data file is essentially a single 2D table
-of floating point numbers with associated metadata, usually saved in
-some ASCII text format. This seems to cover most experiments in the
-physical sciences, but it you need a more complex format with more
-dimensions of data, we suggest you look elsewhere.
-
-DataFile and Friends
---------------------
-
-**Stoner.Core.DataFile** is the base class for representing individual
-experimental data sets. It provides basic methods to examine and
-manipulate data, manage metadata and load and save data files. It has a
-large number of sub classes - most of these are in Stoner.FileFormats
-and are used to handle the loading of specific file formats. Two,
-however, contain additional functionality for writing analysis programs.
-
--   **Stoner.Analysis.AnalyseFile** provides additional methods for curve-fitting, differentiating, smoothing and carrying out
-    :   basic calculations on data.
-
--   **Stoner.Plot.PlotFile** provides additional routines for plotting
-    data on 2D or 3D plots.
-
-As mentioned above, there are subclasses of **DataFile** in the
-**Stoner.FileFormats** module that support loading many of the common
-file formats used in our research.
-
-For rapid development of small scripts, we would recommend the
-**Stoner.Data** class which is a superclass of the above, and provides a
-'kitchen-sink' one stop shop for most of the package's functionality.
-
-DataFolder
-----------
-
-**Stoner.Folders.DataFolder** is a class for assisting with the work of
-processing lots of files in a common directory structure. It provides
-methods to list. filter and group data according to filename patterns or
-metadata and then to execute a function on each file or group of files.
-
-The **Stoner.HDF5** module provides some experimental classes to
-manipulate *DataFile* and *DataFolder* objects within HDF5 format files.
-These are not a way to handle arbitary HDF5 files - the format is much
-to complex and flexible to make that an easy task, rather it is a way to
-work with large numbers of experimental sets using just a single file
-which may be less brutal to your computer's OS than having directory
-trees with millions of individual files.
-
-Resources
-=========
-
-Included in the [github
-repository](http://www.github.com/gb119/Stoner-PythonCode/) are a
-(small) collection of sample scripts for carrying out various operations
-and some sample data files for testing the loading and processing of
-data. There is also a User\_Guide as part of this documentation, along
-with a complete API reference \<Stoner\>
-
-Contact and Licensing
-=====================
-
-The lead developer for this code is [Dr Gavin
-Burnell](http://www.stoner.leeds.ac.uk/people/gb)
-\<<g.burnell@leeds.ac.uk>\> . The User Guide gives the current list of
-other contributors to the project.
-
-This code and the sample data are all (C) The University of Leeds
-2008-2015 unless otherwise indficated in the source file. The contents
-of this package are licensed under the terms of the GNU Public License
-v3
-
-Recent Changes
-==============
-
-Development Version
--------------------
-
-The current development version is 0.6. This features some major changes
-in the architecture, switching from a numpy MaskedArray as the main data
-store to a custom sub-class that contains most of the logic for indexing
-data by column name and designation. The metadata storage has also been
-switched to using blist.sortteddict for a fast, alphabetically ordered
-dictionary storage. Other underlying changes are a switch to using
-properties rather than straight attribute access.
-
-0.6 also adds some extra methods to AnalyseFile for extrapolation.
-
-Online documentation for the development version can be found on the
-[githib repository pages](http://gb119.github.io/Stoner-PythonCode)
-
-[![image](https://zenodo.org/badge/17265/gb119/Stoner-PythonCode.svg)](https://zenodo.org/badge/latestdoi/17265/gb119/Stoner-PythonCode)
-
-Stable Version
---------------
-
-The development version is now in beta release and so no further relases
-will be made to the current stable release (0.5).
+Introduction
+============
+
+The *Stoner* Python package is a set of utility classes for writing data
+analysis code. It was written within the Condensed Matter Physics group
+at the University of Leeds as a shared resource for quickly writing
+simple programs to do things like fitting functions to data, extract
+curve parameters and churn through large numbers of small text data
+files.
+
+For a general introduction, users are referred to the Users Guide, which
+is part of the [online documentation](http://pythonhosted.org/Stoner/)
+along with the API Reference guide. The [github
+repository](http://www.github.com/gb119/Stoner-PythonCode/) also
+contains some example scripts.
+
+Getting this Code
+=================
+
+The *Stoner* package requires numpy \>=1.8, scipy \>=0.14, matplotlib
+\>=1.4, h5py, numba lmfit, filemagic, and blist. Experimental code also
+makes use of the Enthought Tools Suite packages.
+
+Ananconda Python (and probably other scientific Python distributions)
+include nearly all of the dependencies, aprt from lmfit. However, this
+can by installed with the usual tools such as *easy\_install* or *pip*.
+
+~~~~ {.sourceCode .sh}
+easy_install lmfit
+~~~~
+
+The easiest way to install the Stoner package is via seuptools'
+easy\_install
+
+~~~~ {.sourceCode .sh}
+easy_install Stoner
+~~~~
+
+This will install the Stoner package and any missing dependencies into
+your current Python environment. Since the package is under fairly
+constant updates, you might want to follow the development with git. The
+source code, along with example scripts and some sample data files can
+be obtained from the github repository:
+<https://github.com/gb119/Stoner-PythonCode>
+
+The codebase is largely compatible with Python 3.4, with the expception
+of the 3D vector map plots which make use of Enthought's *mayavi*
+package which is still only Python 2 compatible due to the underlying
+Vtk toolkit. Other issues of broken 3.4 code are bugs to be squashed.
+
+Overview
+========
+
+The **Stoner** package provides two basic top-level classes that
+describe an individual file of experimental data and a list (such as a
+directory tree on disc) of many experimental files. For our research, a
+typical single experimental data file is essentially a single 2D table
+of floating point numbers with associated metadata, usually saved in
+some ASCII text format. This seems to cover most experiments in the
+physical sciences, but it you need a more complex format with more
+dimensions of data, we suggest you look elsewhere.
+
+DataFile and Friends
+--------------------
+
+**Stoner.Core.DataFile** is the base class for representing individual
+experimental data sets. It provides basic methods to examine and
+manipulate data, manage metadata and load and save data files. It has a
+large number of sub classes - most of these are in Stoner.FileFormats
+and are used to handle the loading of specific file formats. Two,
+however, contain additional functionality for writing analysis programs.
+
+-   **Stoner.Analysis.AnalyseFile** provides additional methods for curve-fitting, differentiating, smoothing and carrying out
+    :   basic calculations on data.
+
+-   **Stoner.Plot.PlotFile** provides additional routines for plotting
+    data on 2D or 3D plots.
+
+As mentioned above, there are subclasses of **DataFile** in the
+**Stoner.FileFormats** module that support loading many of the common
+file formats used in our research.
+
+For rapid development of small scripts, we would recommend the
+**Stoner.Data** class which is a superclass of the above, and provides a
+'kitchen-sink' one stop shop for most of the package's functionality.
+
+DataFolder
+----------
+
+**Stoner.Folders.DataFolder** is a class for assisting with the work of
+processing lots of files in a common directory structure. It provides
+methods to list. filter and group data according to filename patterns or
+metadata and then to execute a function on each file or group of files.
+
+The **Stoner.HDF5** module provides some experimental classes to
+manipulate *DataFile* and *DataFolder* objects within HDF5 format files.
+These are not a way to handle arbitary HDF5 files - the format is much
+to complex and flexible to make that an easy task, rather it is a way to
+work with large numbers of experimental sets using just a single file
+which may be less brutal to your computer's OS than having directory
+trees with millions of individual files.
+
+Resources
+=========
+
+Included in the [github
+repository](http://www.github.com/gb119/Stoner-PythonCode/) are a
+(small) collection of sample scripts for carrying out various operations
+and some sample data files for testing the loading and processing of
+data. There is also a User\_Guide as part of this documentation, along
+with a complete API reference \<Stoner\>
+
+Contact and Licensing
+=====================
+
+The lead developer for this code is [Dr Gavin
+Burnell](http://www.stoner.leeds.ac.uk/people/gb)
+\<<g.burnell@leeds.ac.uk>\> . The User Guide gives the current list of
+other contributors to the project.
+
+This code and the sample data are all (C) The University of Leeds
+2008-2015 unless otherwise indficated in the source file. The contents
+of this package are licensed under the terms of the GNU Public License
+v3
+
+Recent Changes
+==============
+
+Development Version
+-------------------
+
+The current development version is 0.6. This features some major changes
+in the architecture, switching from a numpy MaskedArray as the main data
+store to a custom sub-class that contains most of the logic for indexing
+data by column name and designation. The metadata storage has also been
+switched to using blist.sortteddict for a fast, alphabetically ordered
+dictionary storage. Other underlying changes are a switch to using
+properties rather than straight attribute access.
+
+0.6 now also makes use of filemagic to work out the mime type of files
+to be loaded to try and improve the resilience of the automatic file
+format detection.
+
+0.6 also adds some extra methods to AnalyseFile for extrapolation.
+
+Online documentation for the development version can be found on the
+[githib repository pages](http://gb119.github.io/Stoner-PythonCode)
+
+[![image](https://zenodo.org/badge/17265/gb119/Stoner-PythonCode.svg)](https://zenodo.org/badge/latestdoi/17265/gb119/Stoner-PythonCode)
+
+Stable Version
+--------------
+
+The development version is now in beta release and so no further relases
+will be made to the current stable release (0.5).
diff --git a/Stoner/Core.py b/Stoner/Core.py
@@ -23,7 +23,10 @@
 import itertools
 from collections import Iterable, OrderedDict
 from blist import sorteddict
-
+try:
+    from magic import Magic as filemagic,MAGIC_MIME_TYPE
+except ImportError:
+    filemagic=None
 
 def copy_into(source,dest):
     """Copies the data associated with source to dest.
@@ -1335,6 +1338,9 @@ class DataFile(object):
     # the file load/save dialog boxes.
     patterns=["*.txt","*.tdi"] # Recognised filename patterns
 
+    #mimetypes we match
+    mime_type=["text/plain"]
+
     _conv_string = _np_.vectorize(lambda x: str(x))
     _conv_float = _np_.vectorize(lambda x: float(x))
 
@@ -2967,14 +2973,23 @@ def load(self, filename=None, auto_load=True, filetype=None, *args, **kargs):
 
         if not path.exists(self.filename):
             raise IOError("Cannot find {} to load".format(self.filename))
+        if filemagic is not None:
+            with filemagic(flags=MAGIC_MIME_TYPE) as m:
+                mimetype=m.id_filename(filename)
+            if self.debug:
+                print("Mimetype:{}".format(mimetype))
         cls = self.__class__
         failed = True
         if auto_load:  # We're going to try every subclass we canA
             for cls in self.subclasses.values():
-                if self.debug:
-                    print(cls.__name__)
                 try:
+                    if filemagic is not None and mimetype not in cls.mime_type: #short circuit for non-=matching mime-types
+                        if self.debug: print("Skipping {} due to mismatcb mime type {}".format(cls.__name__,cls.mime_type))                        
+                        continue
                     test = cls()
+                    if self.debug and filemagic is not None:
+                        print("Trying: {} =mimetype {}".format(cls.__name__,test.mime_type))
+
                     kargs.pop("auto_load",None)
                     test._load(self.filename,auto_load=False,*args,**kargs)
                     failed=False

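The Core.py hunks above make filemagic an optional import and use the detected MIME type to skip subclasses whose `mime_type` list cannot match the file. A minimal standalone sketch of that pattern follows. The helper names `detect_mime` and `candidate_loaders` and the two loader classes are hypothetical illustrations, not part of the Stoner package; only the `magic` import and the `id_filename` call are taken from the diff.

```python
try:
    # Same optional import as in the Core.py hunk (filemagic bindings for libmagic).
    from magic import Magic as filemagic, MAGIC_MIME_TYPE
except ImportError:
    filemagic = None  # bindings missing: MIME detection is simply disabled


def detect_mime(path):
    """Return the MIME type of path, or None when filemagic is unavailable.

    Hypothetical helper; assumes the filemagic API used in the diff.
    """
    if filemagic is None:
        return None
    with filemagic(flags=MAGIC_MIME_TYPE) as m:
        return m.id_filename(path)


class TextLoader:
    """Hypothetical loader for plain-text formats."""
    mime_type = ["text/plain"]


class BinaryLoader:
    """Hypothetical loader for binary formats."""
    mime_type = ["application/octet-stream"]


def candidate_loaders(loaders, mimetype):
    """Yield the loaders worth trying for a file of the given MIME type.

    A mimetype of None means no detection was possible, so every
    loader remains a candidate.
    """
    for cls in loaders:
        if mimetype is not None and mimetype not in cls.mime_type:
            continue  # short circuit: this loader cannot match the file
        yield cls
```

When the bindings are not installed, `detect_mime` returns `None` and every loader stays a candidate, which corresponds to the earlier behaviour of trying each subclass in turn.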
diff --git a/Stoner/FileFormats.py b/Stoner/FileFormats.py
@@ -274,7 +274,7 @@ def _load(self, filename=None, *args, **kargs):
             if python_v3:
                 column_headers = f.readline().strip().split(',')
             else:
-                column_headers = f.next().strip().split(',')                
+                column_headers = f.next().strip().split(',')
         self.data = _np_.genfromtxt(self.filename, dtype='float', delimiter=',', invalid_raise=False, skip_header=i + 2)
         self.column_headers=column_headers
         self.setas(x="Magnetic Field", y="Moment")
@@ -346,6 +346,8 @@ class SPCFile(DataFile):
     # the file load/save dialog boxes.
     patterns=["*.spc"] # Recognised filename patterns
 
+    mime_type=["application/octet-stream"]
+
     def _load(self, filename=None, *args, **kargs):
         """Reads a .scf file produced by the Renishaw Raman system (amongs others)