Commit

Merge pull request #1 from IntelPython/master
Merge changes from origin repo
Vyacheslav-Smirnov committed Jul 31, 2019
2 parents f5469c4 + 95a7bc4 commit 9313df0
Showing 25 changed files with 658 additions and 179 deletions.
4 changes: 0 additions & 4 deletions .travis.yml
Original file line number Diff line number Diff line change
@@ -14,10 +14,6 @@ matrix:
- env: CONDA_ENV=travisci NUM_PES=2
- env: CONDA_ENV=travisci NUM_PES=3

branches:
only:
- master

before_install:
- buildscripts/setup_conda.sh
- export PATH=$HOME/miniconda3/bin:$PATH
9 changes: 5 additions & 4 deletions README.rst
@@ -31,17 +31,18 @@ environment easily (Linux/Mac/Windows)::
.. used if master of Numba is needed for latest hpat package
.. conda create -n HPAT -c ehsantn -c numba/label/dev -c anaconda -c conda-forge hpat
Windows installation requires
`Intel MPI <https://software.intel.com/en-us/intel-mpi-library>`_ to be
installed.

Docker Container
----------------

An HPAT docker image is also available for running containers. For example::

docker run -it ehsantn/hpat bash

Building HPAT from Source
-------------------------

To build HPAT from source, please refer to the following `instructions <docs/source/install.rst>`_

Example
#######

5 changes: 4 additions & 1 deletion buildscripts/hpat-conda-recipe/meta.yaml
@@ -25,6 +25,8 @@ requirements:
- hdf5
- h5py
- mpich # [not win]
- impi-devel # [win]
- impi_rt # [win]

run:
- python
@@ -35,6 +37,7 @@ requirements:
- boost
- numba 0.44.*
- mpich # [not win]
- impi_rt # [win]

test:
requires:
@@ -46,7 +49,7 @@ test:


about:
home: https://github.com/IntelLabs/hpat
home: https://github.com/IntelPython/hpat
license: BSD
license_file: LICENSE.md
summary: A compiler-based big data framework in Python
2 changes: 1 addition & 1 deletion buildscripts/test.sh
@@ -8,7 +8,7 @@ python hpat/tests/gen_test_data.py
if [ "$RUN_COVERAGE" == "yes" ]; then
export PYTHONPATH=.
coverage erase
coverage run --source=./hpat --omit ./hpat/ml/*,./hpat/xenon_ext.py,./hpat/ros.py,./hpat/cv_ext.py,./hpat/tests/gen_test_data.py -m unittest
coverage run --source=./hpat --omit ./hpat/ml/*,./hpat/xenon_ext.py,./hpat/ros.py,./hpat/cv_ext.py,./hpat/tests/* -m unittest
else
mpiexec -n $NUM_PES python -u -m unittest -v
fi
57 changes: 44 additions & 13 deletions docs/source/install.rst
@@ -11,29 +11,45 @@ easily. On Linux/Mac/Windows::
.. used if master of Numba is needed for latest hpat package
.. conda create -n HPAT -c ehsantn -c numba/label/dev -c anaconda -c conda-forge hpat
Windows installation requires
`Intel MPI <https://software.intel.com/en-us/intel-mpi-library>`_ to be
installed.

Building HPAT from Source
-------------------------

We use `Anaconda <https://www.anaconda.com/download/>`_ distribution of
Python for setting up HPAT. These commands install HPAT and its dependencies
such as Numba on Ubuntu Linux::
Python for setting up HPAT.

Miniconda3 is required for build::

wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
chmod +x miniconda.sh
./miniconda.sh -b
export PATH=$HOME/miniconda3/bin:$PATH
conda create -n HPAT -q -y numpy scipy pandas boost cmake

It is possible to build HPAT via conda-build or setuptools. Follow one of the cases below to install HPAT and its dependencies
such as Numba on Ubuntu Linux.

Build with conda-build:
~~~~~~~~~~~~~~~~~~~~~~~
::

conda create -n HPAT python=<3.7 or 3.6>
source activate HPAT
conda install conda-build
git clone https://github.com/IntelPython/hpat
# build HPAT
conda build --python <3.6 or 3.7> -c numba -c conda-forge -c defaults hpat/buildscripts/hpat-conda-recipe/

Build with setuptools:
~~~~~~~~~~~~~~~~~~~~~~
::

conda create -n HPAT -q -y numpy scipy pandas boost cmake python=<3.6 or 3.7>
source activate HPAT
conda install -c numba/label/dev numba
conda install mpich mpi -c conda-forge
conda install pyarrow
conda install h5py -c ehsantn
conda install gcc_linux-64 gxx_linux-64 gfortran_linux-64
git clone https://github.com/IntelLabs/hpat
git clone https://github.com/IntelPython/hpat
cd hpat
# build HPAT
HDF5_DIR=$CONDA_PREFIX python setup.py develop
@@ -57,22 +73,37 @@ to check the channel of ``hdf5`` package.
Building from Source on Windows
-------------------------------

Building HPAT on Windows requires Build Tools for Visual Studio 2017 (14.0) and Intel MPI:
Building HPAT on Windows requires Build Tools for Visual Studio 2017 (14.0):

* Install `Build Tools for Visual Studio 2017 (14.0) <https://www.visualstudio.com/downloads/#build-tools-for-visual-studio-2017>`_.
* Install `Intel MPI <https://software.intel.com/en-us/intel-mpi-library>`_.
* Install `Miniconda for Windows <https://repo.continuum.io/miniconda/Miniconda3-latest-Windows-x86_64.exe>`_.
* Start 'Anaconda prompt'
* Setup the Conda environment in Anaconda Prompt::

conda create -n HPAT -c ehsantn -c numba/label/dev -c anaconda -c conda-forge python=3.7 pandas pyarrow h5py numba scipy boost libboost tbb-devel mkl-devel
It is possible to build HPAT via conda-build or setuptools. Follow one of the cases below to install HPAT and its dependencies on Windows.

Build with conda-build:
~~~~~~~~~~~~~~~~~~~~~~~
::

conda create -n HPAT python=<3.7 or 3.6>
activate HPAT
conda install vc vs2015_runtime vs2015_win-64
git clone https://github.com/IntelPython/hpat.git
conda build --python <3.6 or 3.7> -c numba -c conda-forge -c defaults -c intel hpat/buildscripts/hpat-conda-recipe/

Build with setuptools:
~~~~~~~~~~~~~~~~~~~~~~
::

conda create -n HPAT -c ehsantn -c numba/label/dev -c anaconda -c conda-forge -c intel python=<3.6 or 3.7> pandas pyarrow h5py numba scipy boost libboost tbb-devel mkl-devel impi-devel impi_rt
activate HPAT
conda install vc vs2015_runtime vs2015_win-64
git clone https://github.com/IntelLabs/hpat.git
git clone https://github.com/IntelPython/hpat.git
cd hpat
set INCLUDE=%INCLUDE%;%CONDA_PREFIX%\Library\include
set LIB=%LIB%;%CONDA_PREFIX%\Library\lib
"%I_MPI_ROOT%"\intel64\bin\mpivars.bat
%CONDA_PREFIX%\Library\bin\mpivars.bat quiet
set HDF5_DIR=%CONDA_PREFIX%\Library
python setup.py develop

7 changes: 2 additions & 5 deletions hpat/_distributed.cpp
@@ -80,11 +80,8 @@ PyMODINIT_FUNC PyInit_hdist(void) {
PyLong_FromVoidPtr((void*)(&oneD_reshape_shuffle)));
PyObject_SetAttrString(m, "permutation_int",
PyLong_FromVoidPtr((void*)(&permutation_int)));
PyObject_SetAttrString(
m, "permutation_array_index",
PyLong_FromVoidPtr((void*)(&permutation_array_index)));
PyObject_SetAttrString(m, "fix_i_malloc",
PyLong_FromVoidPtr((void*)(&fix_i_malloc)));
PyObject_SetAttrString(m, "permutation_array_index",
PyLong_FromVoidPtr((void*)(&permutation_array_index)));

// add actual int value to module
PyObject_SetAttrString(m, "mpi_req_num_bytes",
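
Note on the mechanism in this hunk: `PyLong_FromVoidPtr` exposes each C function's address as a plain Python integer attribute on the `hdist` module, so the Python side can register it by name (the removed `ll.add_symbol('fix_i_malloc', hdist.fix_i_malloc)` line in `hpat/distributed_lower.py` did exactly that). The sketch below illustrates the same address-as-integer idea using only the standard-library `ctypes` module. It does not use HPAT or llvmlite, and `ADD_SIG`, `_add`, `add_callback`, and `rebuilt` are illustrative names that do not appear in the commit.

```python
import ctypes

# C-style signature: int64_t f(int64_t, int64_t)
ADD_SIG = ctypes.CFUNCTYPE(ctypes.c_int64, ctypes.c_int64, ctypes.c_int64)


def _add(a, b):
    # Stand-in for a native kernel; a real extension would export a C function.
    return a + b


# Wrap the Python function as a C-callable. Keep this object alive for as
# long as the raw address is in use.
add_callback = ADD_SIG(_add)

# The function pointer as a plain integer, analogous to what
# PyLong_FromVoidPtr stores on the module.
address = ctypes.cast(add_callback, ctypes.c_void_p).value

# Rebuild a callable from nothing but the integer address and the signature.
rebuilt = ADD_SIG(address)
result = rebuilt(2, 3)
```

Because only the integer is shared, removing an export (as this commit does for `fix_i_malloc`) requires removing every consumer of that attribute as well, which is why the C++ and Python hunks change together.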
16 changes: 0 additions & 16 deletions hpat/_distributed.h
@@ -81,7 +81,6 @@ static void permutation_int(int64_t* output, int n) __UNUSED__;
static void permutation_array_index(unsigned char *lhs, int64_t len, int64_t elem_size,
unsigned char *rhs, int64_t *p, int64_t p_len) __UNUSED__;
static int hpat_finalize() __UNUSED__;
static void fix_i_malloc() __UNUSED__;
static int hpat_dummy_ptr[64] __UNUSED__;

/* *********************************************************************
@@ -826,19 +825,4 @@ static void oneD_reshape_shuffle(char* output,
delete[] recv_disp;
}

// fix for tensorflows MKL support that overwrites Intel mallocs,
// which causes Intel MPI to crash.
#ifdef I_MPI_VERSION
#include "i_malloc.h"
static void fix_i_malloc()
{
i_malloc = malloc;
i_calloc = calloc;
i_realloc = realloc;
i_free = free;
}
#else
static void fix_i_malloc() {}
#endif

#endif // _DISTRIBUTED_H_INCLUDED
3 changes: 3 additions & 0 deletions hpat/_str_decode.cpp
@@ -2,6 +2,9 @@
#include <iostream>
#include "_meminfo.h"

#ifndef Py_UNREACHABLE
#define Py_UNREACHABLE() abort()
#endif

// ******** ported from CPython 31e8d69bfe7cf5d4ffe0967cb225d2a8a229cc97

7 changes: 0 additions & 7 deletions hpat/distributed_lower.py
@@ -549,13 +549,6 @@ def dist_permutation_array_index(lhs, lhs_len, dtype_size, rhs, p, p_len):
permutation_array_index(lhs.ctypes, lhs_len, elem_size, c_rhs.ctypes,
p.ctypes, p_len)


ll.add_symbol('fix_i_malloc', hdist.fix_i_malloc)
_fix_i_malloc = types.ExternalFunction("fix_i_malloc", types.void())
@numba.njit
def fix_i_malloc():
_fix_i_malloc()

########### finalize MPI when exiting ####################

def hpat_finalize():
4 changes: 2 additions & 2 deletions hpat/hiframes/api.py
@@ -361,8 +361,8 @@ def generic(self, args, kws):
class FillNaStrType(AbstractTemplate):
def generic(self, args, kws):
assert not kws
assert len(args) == 2
# args: in_arr, value
assert len(args) == 3
# args: in_arr, value, name
return signature(SeriesType(string_type), *args)

@infer_global(dropna)
13 changes: 5 additions & 8 deletions hpat/hiframes/hiframes_typed.py
@@ -1114,6 +1114,7 @@ def _run_call_series_fillna(self, assign, lhs, rhs, series_var):
val = rhs.args[0]
nodes = []
data = self._get_series_data(series_var, nodes)
name = self._get_series_name(series_var, nodes)
kws = dict(rhs.kws)
inplace = False
if 'inplace' in kws:
@@ -1133,18 +1134,14 @@ def _run_call_series_fillna(self, assign, lhs, rhs, series_var):
# array and assign it back to the same Series variable
# result back to the same variable
# TODO: handle string array reflection
def str_fillna_impl(A, fill):
def str_fillna_impl(A, fill, name):
# not using A.fillna since definition list is not working
# for A to find callname
return hpat.hiframes.api.fillna_str_alloc(A, fill)
return hpat.hiframes.api.fillna_str_alloc(A, fill, name)
#A.fillna(fill)

fill_var = rhs.args[0]
assign.target = series_var # replace output
return self._replace_func(
str_fillna_impl,
[data, fill_var],
pre_nodes=nodes)
return self._replace_func(str_fillna_impl, [data, val, name], pre_nodes=nodes)
else:
return self._replace_func(
lambda a,b,c: hpat.hiframes.api.fillna(a,b,c),
@@ -1155,7 +1152,7 @@ def str_fillna_impl(A, fill):
func = series_replace_funcs['fillna_str_alloc']
else:
func = series_replace_funcs['fillna_alloc']
return self._replace_func(func, [data, val], pre_nodes=nodes)
return self._replace_func(func, [data, val, name], pre_nodes=nodes)

def _run_call_series_dropna(self, assign, lhs, rhs, series_var):
dtype = self.typemap[series_var.name].dtype
8 changes: 4 additions & 4 deletions hpat/hiframes/series_kernels.py
@@ -43,7 +43,7 @@ def _column_fillna_impl(A, B, fill): # pragma: no cover
s = fill
A[i] = s

def _series_fillna_str_alloc_impl(B, fill): # pragma: no cover
def _series_fillna_str_alloc_impl(B, fill, name): # pragma: no cover
n = len(B)
num_chars = 0
# get total chars in new array
@@ -55,7 +55,7 @@ def _series_fillna_str_alloc_impl(B, fill): # pragma: no cover
num_chars += len(s)
A = hpat.str_arr_ext.pre_alloc_string_array(n, num_chars)
hpat.hiframes.api.fillna(A, B, fill)
return hpat.hiframes.api.init_series(A)
return hpat.hiframes.api.init_series(A, None, name)

def _series_dropna_float_impl(S, name): # pragma: no cover
old_len = len(S)
@@ -304,11 +304,11 @@ def _column_describe_impl(S): # pragma: no cover
"max " + str(a_max) + "\n"
return res

def _column_fillna_alloc_impl(S, val): # pragma: no cover
def _column_fillna_alloc_impl(S, val, name): # pragma: no cover
# TODO: handle string, etc.
B = np.empty(len(S), S.dtype)
hpat.hiframes.api.fillna(B, S, val)
return hpat.hiframes.api.init_series(B)
return hpat.hiframes.api.init_series(B, None, name)


def _str_contains_regex_impl(str_arr, pat): # pragma: no cover
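
Note on the `fillna` changes across `api.py`, `hiframes_typed.py`, and `series_kernels.py`: every kernel gains a `name` argument and forwards it to `init_series(..., None, name)`, apparently so that the filled Series keeps the name of its input instead of coming back unnamed. A toy sketch of that before/after behaviour follows. `ToySeries` and both helper functions are hypothetical stand-ins written for this note, not HPAT or pandas classes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToySeries:
    """Hypothetical stand-in for a named column; not HPAT's SeriesType."""
    data: List[Optional[float]]
    name: Optional[str] = None


def fillna_dropping_name(s: ToySeries, fill: float) -> ToySeries:
    # Mirrors the old kernels: the result is built from the data alone.
    return ToySeries([fill if v is None else v for v in s.data])


def fillna_keeping_name(s: ToySeries, fill: float) -> ToySeries:
    # Mirrors the new kernels: the name is passed through explicitly.
    return ToySeries([fill if v is None else v for v in s.data], s.name)


col = ToySeries([1.0, None, 3.0], name="A")
old_result = fillna_dropping_name(col, 0.0)
new_result = fillna_keeping_name(col, 0.0)
```

The extra parameter also explains the arity change in `FillNaStrType.generic`, where the assertion moves from two arguments to three.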
47 changes: 31 additions & 16 deletions hpat/str_ext.py
@@ -52,22 +52,37 @@ def unliteral_all(args):

## use objmode for string methods for now

# string methods that just return another string
str2str_methods = ('capitalize', 'casefold', 'lower', 'lstrip', 'rstrip',
'strip', 'swapcase', 'title', 'upper')

for method in str2str_methods:
func_text = "def str_overload(in_str):\n"
func_text += " def _str_impl(in_str):\n"
func_text += " with numba.objmode(out='unicode_type'):\n"
func_text += " out = in_str.{}()\n".format(method)
func_text += " return out\n"
func_text += " return _str_impl\n"
loc_vars = {}
exec(func_text, {'numba': numba}, loc_vars)
str_overload = loc_vars['str_overload']
overload_method(types.UnicodeType, method)(str_overload)

# string methods that take no arguments and return a string
str2str_noargs = ('capitalize', 'casefold', 'lower', 'swapcase', 'title', 'upper')

def str_overload_noargs(method):
@overload_method(types.UnicodeType, method)
def str_overload(in_str):
def _str_impl(in_str):
with numba.objmode(out='unicode_type'):
out = getattr(in_str, method)()
return out

return _str_impl

for method in str2str_noargs:
str_overload_noargs(method)

# strip string methods that take one argument and return a string
str2str_1arg = ('lstrip', 'rstrip', 'strip')

def str_overload_1arg(method):
@overload_method(types.UnicodeType, method)
def str_overload(in_str, arg1):
def _str_impl(in_str, arg1):
with numba.objmode(out='unicode_type'):
out = getattr(in_str, method)(arg1)
return out

return _str_impl

for method in str2str_1arg:
str_overload_1arg(method)

@overload_method(types.UnicodeType, 'replace')
def str_replace_overload(in_str, old, new, count=-1):
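
Note on the `str_ext.py` rewrite: the `exec`-generated overloads are replaced by helper functions (`str_overload_noargs`, `str_overload_1arg`) that are called once per method name. One likely reason for using a helper rather than defining the overload directly inside the `for` loop is Python's late binding of closure variables. The sketch below shows that pitfall and the factory-style fix in plain Python, without Numba. `build_late_bound` and `make_caller` are illustrative names that do not appear in the commit.

```python
def build_late_bound(methods):
    # Pitfall: every lambda closes over the same loop variable, so after the
    # loop finishes they all see its final value.
    callers = []
    for method in methods:
        callers.append(lambda s: getattr(s, method)())
    return callers


def make_caller(method):
    # Fix: each call to the factory creates a fresh scope, so `method` is
    # bound separately for every returned function.
    def call(in_str):
        return getattr(in_str, method)()
    return call


late_bound = build_late_bound(("lower", "upper"))
bound = [make_caller(m) for m in ("lower", "upper")]

late_first = late_bound[0]("Ab")   # uses "upper", the last loop value
bound_first = bound[0]("Ab")       # uses "lower", as intended
bound_second = bound[1]("Ab")
```

The same structure appears in the new code: the loop body only calls the helper, and the helper's own `method` parameter is what `getattr(in_str, method)` reads.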