Merge pull request #72 from gmrukwa/develop
Release 2.5.12
gmrukwa committed Dec 13, 2020
2 parents 0aa7958 + 42f0e16 commit 8011932
Showing 17 changed files with 111 additions and 75 deletions.
12 changes: 6 additions & 6 deletions .github/workflows/deploy.yml
Original file line number Diff line number Diff line change
@@ -12,7 +12,7 @@ on:
env:
MAJOR: ${{ 2 }}
MINOR: ${{ 5 }}
FIXUP: ${{ 11 }}
FIXUP: ${{ 12 }}
PACKAGE_INIT_FILE: ${{ 'divik/__init__.py' }}
PACKAGE_INIT_FILE_VERSION_LINE: ${{ 1 }}
PACKAGE_SETUP_FILE: ${{ 'setup.py' }}
@@ -202,11 +202,11 @@ jobs:
run: brew install libomp
- name: Build Python package
run: |
echo "::add-path::/usr/local/opt/llvm/bin"
echo "::set-env name=C_INCLUDE_PATH::/usr/local/opt/llvm/include:$C_INCLUDE_PATH"
echo "::set-env name=CPLUS_INCLUDE_PATH::/usr/local/opt/llvm/include:$CPLUS_INCLUDE_PATH"
echo "::set-env name=LIBRARY_PATH::/usr/local/opt/llvm/lib:/usr/local/opt/libomp/lib:$LIBRARY_PATH"
echo "::set-env name=DYLD_LIBRARY_PATH::/usr/local/opt/llvm/lib:/usr/local/opt/libomp/lib:$DYLD_LIBRARY_PATH"
echo "/usr/local/opt/llvm/bin" >> $GITHUB_PATH
echo "C_INCLUDE_PATH=/usr/local/opt/llvm/include:$C_INCLUDE_PATH" >> $GITHUB_ENV
echo "CPLUS_INCLUDE_PATH=/usr/local/opt/llvm/include:$CPLUS_INCLUDE_PATH" >> $GITHUB_ENV
echo "LIBRARY_PATH=/usr/local/opt/llvm/lib:/usr/local/opt/libomp/lib:$LIBRARY_PATH" >> $GITHUB_ENV
echo "DYLD_LIBRARY_PATH=/usr/local/opt/llvm/lib:/usr/local/opt/libomp/lib:$DYLD_LIBRARY_PATH" >> $GITHUB_ENV
pip wheel . -w dist
env:
CC: /usr/local/opt/llvm/bin/clang
10 changes: 5 additions & 5 deletions .github/workflows/unittest.yml
@@ -74,11 +74,11 @@ jobs:
run: brew install libomp
- name: Install native lib
run: |
echo "::add-path::/usr/local/opt/llvm/bin"
echo "::set-env name=C_INCLUDE_PATH::/usr/local/opt/llvm/include:$C_INCLUDE_PATH"
echo "::set-env name=CPLUS_INCLUDE_PATH::/usr/local/opt/llvm/include:$CPLUS_INCLUDE_PATH"
echo "::set-env name=LIBRARY_PATH::/usr/local/opt/llvm/lib:/usr/local/opt/libomp/lib:$LIBRARY_PATH"
echo "::set-env name=DYLD_LIBRARY_PATH::/usr/local/opt/llvm/lib:/usr/local/opt/libomp/lib:$DYLD_LIBRARY_PATH"
echo "/usr/local/opt/llvm/bin" >> $GITHUB_PATH
echo "C_INCLUDE_PATH=/usr/local/opt/llvm/include:$C_INCLUDE_PATH" >> $GITHUB_ENV
echo "CPLUS_INCLUDE_PATH=/usr/local/opt/llvm/include:$CPLUS_INCLUDE_PATH" >> $GITHUB_ENV
echo "LIBRARY_PATH=/usr/local/opt/llvm/lib:/usr/local/opt/libomp/lib:$LIBRARY_PATH" >> $GITHUB_ENV
echo "DYLD_LIBRARY_PATH=/usr/local/opt/llvm/lib:/usr/local/opt/libomp/lib:$DYLD_LIBRARY_PATH" >> $GITHUB_ENV
python dev_setup.py install
env:
CC: /usr/local/opt/llvm/bin/clang
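Both workflow files replace the deprecated `::set-env` and `::add-path` workflow commands with appends to the files named by `$GITHUB_ENV` and `$GITHUB_PATH`. The same mechanism can be sketched in Python, for a step that runs a script instead of shell `echo` lines. The helper names `export_env`, `add_path`, and `demo` are illustrative assumptions and are not part of this commit; `demo` writes to temporary stand-in files so the sketch runs outside a runner.

```python
import os
import tempfile


def export_env(name, value, env_file=None):
    """Append NAME=value to the environment file read by later steps."""
    # On a GitHub Actions runner, GITHUB_ENV holds the path of that file.
    env_file = env_file or os.environ["GITHUB_ENV"]
    with open(env_file, "a", encoding="utf-8") as handle:
        handle.write("{0}={1}\n".format(name, value))


def add_path(directory, path_file=None):
    """Append a directory to the PATH file read by later steps."""
    path_file = path_file or os.environ["GITHUB_PATH"]
    with open(path_file, "a", encoding="utf-8") as handle:
        handle.write(directory + "\n")


def demo():
    """Exercise both helpers against temporary stand-in files."""
    with tempfile.TemporaryDirectory() as tmp:
        env_file = os.path.join(tmp, "github_env")
        path_file = os.path.join(tmp, "github_path")
        add_path("/usr/local/opt/llvm/bin", path_file)
        export_env("C_INCLUDE_PATH", "/usr/local/opt/llvm/include", env_file)
        with open(env_file, encoding="utf-8") as handle:
            env_text = handle.read()
        with open(path_file, encoding="utf-8") as handle:
            path_text = handle.read()
    return env_text, path_text
```

Values written this way become visible to subsequent steps of the job, not to the remainder of the step that writes them.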
16 changes: 14 additions & 2 deletions README.md
@@ -40,7 +40,7 @@ docker pull gmrukwa/divik
To install a specific version, specify it in the command, e.g.:

```bash
docker pull gmrukwa/divik:2.5.11
docker pull gmrukwa/divik:2.5.12
```

## Python package
@@ -79,7 +79,7 @@ pip install divik
or any stable tagged version, e.g.:

```bash
pip install divik==2.5.11
pip install divik==2.5.12
```

If you want to have compatibility with
@@ -92,6 +92,18 @@ pip install divik[gin]

**Note:** Remember about `\` before `[` and `]` in `zsh` shell.

If you want to launch the `inspect` tool, you need to install its extras with:

```bash
pip install divik[inspect]
```

You can install all extras with:

```bash
pip install divik[all]
```

# High-Volume Data Considerations

If you are using DiviK to run the analysis that could fail to fit RAM of your
2 changes: 1 addition & 1 deletion divik/__init__.py
@@ -1,4 +1,4 @@
__version__ = '2.5.11'
__version__ = '2.5.12'

from ._summary import plot, reject_split

2 changes: 1 addition & 1 deletion divik/_cli/_utils.py
@@ -12,7 +12,7 @@

import divik.core as u
from divik import __version__
from divik._cli._data_io import load_data
from divik.core.io import load_data


def parse_args():
2 changes: 1 addition & 1 deletion divik/_cli/divik.py
@@ -9,7 +9,7 @@
import skimage.io as sio

from divik.core import build
from divik._cli._data_io import DIVIK_RESULT_FNAME
from divik.core.io import DIVIK_RESULT_FNAME
from divik.cluster import DiviK, GAPSearch, KMeans
import divik._summary as _smr
import divik._cli._utils as sc
2 changes: 1 addition & 1 deletion divik/_cli/fit_clusters.py
@@ -3,7 +3,7 @@
import gin

from divik.core import dump_gin_args, parse_gin_args
from ._model_io import save
from divik.core.io import save
from ._utils import (
prepare_destination,
setup_logger,
39 changes: 38 additions & 1 deletion divik/_cli/inspect.py
@@ -1,6 +1,11 @@
import argparse as agp
import glob
import logging
import os
from itertools import chain
from typing import List

from divik._cli._data_io import as_divik_result_path
from divik.core.io import DIVIK_RESULT_FNAME
from divik._inspect.app import app, divik_result, xy
# noinspection PyUnresolvedReferences
import divik._inspect.callback
@@ -23,6 +28,38 @@ def parse_args():
return parser.parse_args()


def _result_path_patterns(slug: str) -> List[str]:
slug_pattern = '*{0}*'.format(slug)
direct = os.path.join(slug_pattern, DIVIK_RESULT_FNAME)
prefixed = os.path.join('**', slug_pattern, DIVIK_RESULT_FNAME)
suffixed = os.path.join(slug_pattern, '**', DIVIK_RESULT_FNAME)
bothfixed = os.path.join('**', slug_pattern, '**', DIVIK_RESULT_FNAME)
return list((direct, prefixed, suffixed, bothfixed))


def _find_possible_directories(patterns: List[str]) -> List[str]:
possible_locations = chain.from_iterable(
glob.glob(pattern, recursive=True) for pattern in patterns)
possible_paths = list({
os.path.split(fname)[0] for fname in possible_locations
})
return possible_paths


def as_divik_result_path(path_or_slug: str):
possible_location = os.path.join(path_or_slug, DIVIK_RESULT_FNAME)
if os.path.exists(possible_location):
return path_or_slug
patterns = _result_path_patterns(path_or_slug)
possible_paths = _find_possible_directories(patterns)
if not possible_paths:
raise FileNotFoundError(path_or_slug)
if len(possible_paths) > 1:
msg = 'Multiple possible result directories: {0}. Selecting {1}.'
logging.warning(msg.format(possible_paths, possible_paths[0]))
return possible_paths[0]


def main():
args = parse_args()
path = as_divik_result_path(args.result)
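The slug lookup moved into `inspect.py` relies on four glob patterns evaluated with `recursive=True`, so that `**` can match any number of intermediate directories. A minimal standalone sketch of that lookup follows. It assumes a `root` parameter (the code in the diff searches relative to the current working directory instead), and the names `result_path_patterns`, `find_result_dirs`, and `demo` are illustrative rather than the package's API.

```python
import glob
import os
import tempfile

RESULT_FNAME = "result.pkl"  # mirrors DIVIK_RESULT_FNAME from the diff


def result_path_patterns(slug):
    """Build the four patterns: direct, prefixed, suffixed, both."""
    slug_pattern = "*{0}*".format(slug)
    return [
        os.path.join(slug_pattern, RESULT_FNAME),
        os.path.join("**", slug_pattern, RESULT_FNAME),
        os.path.join(slug_pattern, "**", RESULT_FNAME),
        os.path.join("**", slug_pattern, "**", RESULT_FNAME),
    ]


def find_result_dirs(root, slug):
    """Return directories under root that hold a result file for slug."""
    found = set()
    for pattern in result_path_patterns(slug):
        # glob.escape keeps special characters in root from being expanded.
        full_pattern = os.path.join(glob.escape(root), pattern)
        for fname in glob.glob(full_pattern, recursive=True):
            found.add(os.path.relpath(os.path.dirname(fname), root))
    return sorted(found)


def demo():
    """Create two result directories and look them up by slug."""
    with tempfile.TemporaryDirectory() as root:
        for parts in (("runs", "2020-12-13-abc123"), ("abc123-rerun", "nested")):
            directory = os.path.join(root, *parts)
            os.makedirs(directory)
            open(os.path.join(directory, RESULT_FNAME), "wb").close()
        return find_result_dirs(root, "abc123")
```

The set removes duplicates, which matter here because one file can match several of the four patterns. When more than one directory remains, the function in the diff logs a warning and picks the first entry.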
2 changes: 1 addition & 1 deletion divik/_cli/visualize.py
@@ -2,7 +2,7 @@

from skimage.io import imsave

from divik._cli._data_io import load_data
from divik.core.io import load_data
from divik.core import visualize


2 changes: 1 addition & 1 deletion divik/_inspect/app.py
@@ -6,7 +6,7 @@
import numpy as np

from divik.core import DivikResult
from divik._cli._data_io import load_data
from divik.core.io import load_data


external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']
9 changes: 9 additions & 0 deletions divik/core/io/__init__.py
@@ -0,0 +1,9 @@
from ._data_io import (
load_data,
)
from ._model_io import (
saver,
save
)

DIVIK_RESULT_FNAME = 'result.pkl'
38 changes: 0 additions & 38 deletions divik/_cli/_data_io.py → divik/core/io/_data_io.py
@@ -1,9 +1,6 @@
from itertools import chain
import glob
import logging
import os
from functools import partial
from typing import List

import h5py
import numpy as np
@@ -55,38 +52,3 @@ def load_data(path: str) -> u.Data:
logging.error(message)
raise IOError(message)
return loader(path)


DIVIK_RESULT_FNAME = 'result.pkl'


def _result_path_patterns(slug: str) -> List[str]:
slug_pattern = '*{0}*'.format(slug)
direct = os.path.join(slug_pattern, DIVIK_RESULT_FNAME)
prefixed = os.path.join('**', slug_pattern, DIVIK_RESULT_FNAME)
suffixed = os.path.join(slug_pattern, '**', DIVIK_RESULT_FNAME)
bothfixed = os.path.join('**', slug_pattern, '**', DIVIK_RESULT_FNAME)
return list((direct, prefixed, suffixed, bothfixed))


def _find_possible_directories(patterns: List[str]) -> List[str]:
possible_locations = chain.from_iterable(
glob.glob(pattern, recursive=True) for pattern in patterns)
possible_paths = list({
os.path.split(fname)[0] for fname in possible_locations
})
return possible_paths


def as_divik_result_path(path_or_slug: str):
possible_location = os.path.join(path_or_slug, DIVIK_RESULT_FNAME)
if os.path.exists(possible_location):
return path_or_slug
patterns = _result_path_patterns(path_or_slug)
possible_paths = _find_possible_directories(patterns)
if not possible_paths:
raise FileNotFoundError(path_or_slug)
if len(possible_paths) > 1:
msg = 'Multiple possible result directories: {0}. Selecting {1}.'
logging.warning(msg.format(possible_paths, possible_paths[0]))
return possible_paths[0]
5 changes: 3 additions & 2 deletions divik/_cli/_model_io.py → divik/core/io/_model_io.py
@@ -39,7 +39,7 @@ def save_divik(model, destination, **kwargs):
if not isinstance(model.result_, DivikResult):
logging.info("Skipping DiviK details save. Cause: result is None")
return
from .divik import make_merged, save_merged
from divik._cli.divik import make_merged, save_merged
logging.info('Saving DiviK details.')
logging.info('Saving DivikResult pickle.')
with open(os.path.join(destination, 'result.pkl'), 'wb') as pkl:
@@ -106,6 +106,7 @@ def save_cluster_paths(model, destination, **kwargs):
'cluster_number': list(model.reverse_paths_.values())
}).to_csv(os.path.join(destination, 'paths.csv'))


@saver
def save_pipeline(model, destination, **kwargs):
if not isinstance(model, Pipeline):
@@ -136,7 +137,7 @@ def save_pipeline(model, destination, **kwargs):
np.savetxt(os.path.join(destination, 'final_partition.csv'), clustering.labels_,
delimiter=', ', fmt='%i')
if not os.path.exists(os.path.join(destination, 'partition-0.png')):
from .divik import save_merged
from divik._cli.divik import save_merged
save_merged(
destination,
clustering.labels_.reshape(-1, 1),
7 changes: 4 additions & 3 deletions divik/feature_selection/_gmm_selector.py
@@ -91,9 +91,10 @@ def __init__(self, stat: str, use_log: bool = False,
n_candidates: int = None, min_features: int = 1,
min_features_rate: float = .0, preserve_high: bool = True,
max_components: int = 10):
if stat not in {'mean', 'var'}:
logging.error('stat must be one of {"mean", "var"}')
raise ValueError('stat must be one of {"mean", "var"}')
if stat not in {'cv', 'mean', 'var'} and not callable(stat):
msg = 'stat must be one of {"cv", "mean", "var"} or callable'
logging.error(msg)
raise ValueError(msg)
self.stat = stat
self.use_log = use_log
self.n_candidates = n_candidates
12 changes: 10 additions & 2 deletions divik/feature_selection/_stat_selector_mixin.py
@@ -28,9 +28,17 @@ def _to_characteristics(self, X):
vals = np.mean(X, axis=0)
elif self.stat == 'var':
vals = np.var(X, axis=0)
elif self.stat == 'cv':
vals = np.std(X, axis=0) / np.mean(X, axis=0)
elif callable(self.stat):
vals = self.stat(X)
if vals.size != X.shape[1]:
raise RuntimeError(
'Computed statistic shape mismatch {0}'.format(vals.shape))
else:
logging.error('stat must be one of {"mean", "var"}')
raise ValueError('stat must be one of {"mean", "var"}')
msg = 'stat must be one of {"cv", "mean", "var"} or callable'
logging.error(msg)
raise ValueError(msg)

if hasattr(self, 'use_log') and self.use_log:
if np.any(vals < 0):
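The hunk above extends `stat` with a coefficient-of-variation option (`'cv'`) and with arbitrary callables, whose output is checked to contain one value per feature. The dispatch can be sketched without NumPy. This version deliberately swaps in the standard-library `statistics` module and plain lists of rows, so it is not the package's code. `pstdev` and `pvariance` are used because `np.std` and `np.var` default to the population estimates. `column_characteristics` and `column_range` are illustrative names.

```python
from statistics import mean, pstdev, pvariance


def column_characteristics(rows, stat):
    """Compute one value per column, selected by name or by callable."""
    columns = list(zip(*rows))
    if stat == "mean":
        values = [mean(col) for col in columns]
    elif stat == "var":
        values = [pvariance(col) for col in columns]
    elif stat == "cv":
        # Coefficient of variation: standard deviation over mean.
        values = [pstdev(col) / mean(col) for col in columns]
    elif callable(stat):
        values = list(stat(rows))
        if len(values) != len(columns):
            raise RuntimeError(
                "Computed statistic shape mismatch {0}".format(len(values)))
    else:
        raise ValueError(
            'stat must be one of {"cv", "mean", "var"} or callable')
    return values


def column_range(rows):
    """Example of a user-supplied statistic: max minus min per column."""
    return [max(col) - min(col) for col in zip(*rows)]
```

For example, `column_characteristics(rows, column_range)` passes a custom statistic in place of a name. In this sketch a column with zero mean makes the `'cv'` branch raise `ZeroDivisionError`; the NumPy expression in the diff would produce `inf` or `nan` there instead.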
4 changes: 2 additions & 2 deletions docs/instructions/installation.rst
@@ -14,7 +14,7 @@ To install latest stable version use::

To install a specific version, specify it in the command, e.g.::

docker pull gmrukwa/divik:2.5.11
docker pull gmrukwa/divik:2.5.12

Python package
--------------
@@ -31,7 +31,7 @@ package::

or any stable tagged version, e.g.::

pip install divik==2.5.11
pip install divik==2.5.12

If you want to have compatibility with
`gin-config <https://github.com/google/gin-config>`_, you can install
22 changes: 14 additions & 8 deletions setup.py
@@ -6,7 +6,7 @@
import sys
import numpy

__version__ = '2.5.11'
__version__ = '2.5.12'

LINUX_OPTS = {
'extra_link_args': [
@@ -89,10 +89,6 @@
packages=find_packages(exclude=['test']),
# @gmrukwa: https://packaging.python.org/discussions/install-requires-vs-requirements/
install_requires=[
'dash==0.34.0',
'dash-html-components==0.13.4',
'dash-core-components==0.42.0',
'dash-table==3.1.11',
'dask[dataframe]>=2.14.0',
'dask-distance>=0.2.0',
'h5py>=2.8.0',
@@ -111,14 +107,24 @@
'all': [
'absl-py',
'gin-config',
'dash==0.34.0',
'dash-html-components==0.13.4',
'dash-core-components==0.42.0',
'dash-table==3.1.11',
'polyaxon',
],
'gin': [
"absl-py",
"gin-config",
'absl-py',
'gin-config',
],
'inspect': [
'dash==0.34.0',
'dash-html-components==0.13.4',
'dash-core-components==0.42.0',
'dash-table==3.1.11',
],
'polyaxon': [
"polyaxon",
'polyaxon',
],
},
python_requires='>=3.6',
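In `setup.py` the `all` extra repeats the requirements of `gin`, `inspect`, and `polyaxon` by hand. One possible alternative, shown only as a sketch and not what this commit does, is to derive `all` from the individual groups so the lists cannot drift apart. `build_extras` and `EXTRAS` are hypothetical names; the requirement strings are copied from the diff.

```python
def build_extras(groups):
    """Copy the groups and add an 'all' entry that merges them in order."""
    extras = {name: list(reqs) for name, reqs in groups.items()}
    merged = []
    for reqs in groups.values():
        for req in reqs:
            if req not in merged:  # skip requirements already collected
                merged.append(req)
    extras["all"] = merged
    return extras


EXTRAS = build_extras({
    "gin": ["absl-py", "gin-config"],
    "inspect": [
        "dash==0.34.0",
        "dash-html-components==0.13.4",
        "dash-core-components==0.42.0",
        "dash-table==3.1.11",
    ],
    "polyaxon": ["polyaxon"],
})
```

The resulting dictionary could then be passed as `extras_require=EXTRAS` in the `setup()` call.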