This repository has been archived by the owner on Feb 2, 2024. It is now read-only.

Merge d338c97 into 5fba799
rdesai16 committed Aug 30, 2019
2 parents 5fba799 + d338c97 commit 570a6b5
Showing 8 changed files with 448 additions and 440 deletions.
14 changes: 10 additions & 4 deletions docs/conf.py
@@ -87,13 +87,13 @@
 # The theme to use for HTML and HTML Help pages. See the documentation for
 # a list of builtin themes.
 #
-html_theme = 'alabaster'
+html_theme = 'classic'

 # Theme options are theme-specific and customize the look and feel of a theme
 # further. For a list of options available for each theme, see the
 # documentation.
 #
-# html_theme_options = {}
+html_theme_options = {}

 # Add any paths that contain custom static files (such as style sheets) here,
 # relative to this directory. They are copied after the builtin static files,
@@ -157,6 +157,12 @@
 'Miscellaneous'),
 ]

+numfig = True

-# Example configuration for intersphinx: refer to the Python standard library.
-intersphinx_mapping = {'https://docs.python.org/': None}
+#configuration for intersphinx
+intersphinx_mapping = {
+'python': ('https://docs.python.org/', None),
+'numba': ('http://numba.pydata.org/numba-doc/latest/index.html', None),
+'numpy': ('http://docs.scipy.org/doc/numpy', None),
+'pandas': ('https://pandas.pydata.org/pandas-docs/stable/', None),
+}
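
For reference, here is a hedged sketch of how an ``intersphinx_mapping`` like the one in this hunk is typically laid out. Each key is a project name and each value is a ``(base URL, inventory)`` pair. With ``None`` as the inventory, Sphinx looks for ``objects.inv`` under the base URL. The ``inventory_url`` helper is hypothetical and only illustrates that lookup. The directory-style Numba URL is an assumption about where that site's inventory lives and is not taken from the commit.

```python
# Illustrative conf.py fragment plus a hypothetical helper; not part of the commit.
intersphinx_mapping = {
    "python": ("https://docs.python.org/", None),
    # Assumption: the inventory sits under the docs root, so the target is the
    # directory URL rather than a page such as index.html.
    "numba": ("http://numba.pydata.org/numba-doc/latest/", None),
    "numpy": ("http://docs.scipy.org/doc/numpy", None),
    "pandas": ("https://pandas.pydata.org/pandas-docs/stable/", None),
}


def inventory_url(base_url):
    """Hypothetical helper: URL where objects.inv is expected for a base URL."""
    return base_url.rstrip("/") + "/objects.inv"


if __name__ == "__main__":
    # Print the inventory location implied by each mapping entry.
    for name, (base_url, _inventory) in intersphinx_mapping.items():
        print(name, "->", inventory_url(base_url))
```

Checking the implied inventory URLs this way can catch a target that points at a page instead of a documentation root before the docs are built.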
8 changes: 3 additions & 5 deletions docs/index.rst
@@ -8,12 +8,10 @@ HPAT documentation

 .. toctree::
 :maxdepth: 2
+:numbered:
 :caption: Contents:

 source/overview
-source/supported
-source/supportedpandas
-source/notsupported
+source/userguide
 source/install
-source/development
-source/aws
+source/references
53 changes: 0 additions & 53 deletions docs/source/aws.rst

This file was deleted.

51 changes: 49 additions & 2 deletions docs/source/install.rst
@@ -11,8 +11,8 @@ easily. On Linux/Mac/Windows::
 .. used if master of Numba is needed for latest hpat package
 .. conda create -n HPAT -c ehsantn -c numba/label/dev -c anaconda -c conda-forge hpat
-Building HPAT from Source
--------------------------
+Building from Source on Linux
+-----------------------------

 We use `Anaconda <https://www.anaconda.com/download/>`_ distribution of
 Python for setting up HPAT.
@@ -118,3 +118,50 @@ Troubleshooting Windows Build
 * For setting up Visual Studio, one might need to go to the registry at
 ``HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\VisualStudio\SxS\VS7``,
 and add a string value named ``14.0`` whose data is ``C:\Program Files (x86)\Microsoft Visual Studio 14.0\``.

+AWS Setup
+---------
+
+This section describes a simple setup process for HPAT on Amazon EC2 instances. You need an Amazon Web Services (AWS) account
+and familiarity with the general AWS EC2 instance launch interface. The process below is for demonstration purposes only and is not
+recommended for production use because of security, performance, and other considerations.
+
+1. Launch instances:
+a. Select a Linux image and instance type (e.g. Ubuntu Server 18.04 on c5n types for high network bandwidth).
+b. Select the number of instances (e.g. 4).
+c. Select the placement group option for better network performance (check "add instance to placement group").
+d. Enable all ports in the security group configuration to simplify MPI setup (add a new rule with "All traffic" Type and "Anywhere" Source).
+
+2. Set up password-less SSH between instances:
+a. Copy your key from your client to all instances. For example, on a Linux client, run this for each instance (find the public host names in the AWS portal)::
+
+scp -i "user.pem" user.pem ubuntu@ec2-11-111-11-111.us-east-2.compute.amazonaws.com:~/.ssh/id_rsa
+
+b. Disable the SSH host key check by running this command on all instances::
+
+echo -e "Host *\n StrictHostKeyChecking no" > .ssh/config
+
+c. Create a host file listing the private hostnames of the instances in the home directory of every instance::
+
+echo -e "ip-11-11-11-11.us-east-2.compute.internal\nip-11-11-11-12.us-east-2.compute.internal\n" > hosts
+
+3. Install Miniconda (a minimal Anaconda Python distribution) and HPAT on all instances::
+
+wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
+chmod +x miniconda.sh
+./miniconda.sh -b
+export PATH=$HOME/miniconda3/bin:$PATH
+conda create -n HPAT -c ehsantn -c anaconda -c conda-forge hpat
+source activate HPAT
+
+4. Copy the `Pi example <https://github.com/IntelLabs/hpat#example>`_ to a file called pi.py in the home directory of all instances, then run it with and without MPI and compare execution times.
+You should see a speedup when running on more cores (the "-n 2" and "-n 4" cases)::
+
+python pi.py # Execution time: 2.119
+mpiexec -f hosts -n 2 python pi.py # Execution time: 1.0569
+mpiexec -f hosts -n 4 python pi.py # Execution time: 0.5286
+
+
+A possible next experiment is running a more complex example such as the
+`logistic regression example <https://github.com/IntelLabs/hpat/blob/master/examples/logistic_regression_rand.py>`_.
+Attaching a shared EFS storage volume and experimenting with parallel I/O in HPAT is also recommended.
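
To make the computation behind the Pi timings concrete, the following is a minimal serial sketch of a Monte Carlo estimate of pi. It is not the code from the linked Pi example: it uses only the Python standard library, does not use HPAT or MPI, and the name ``calc_pi`` and its ``seed`` parameter are assumptions chosen for illustration.

```python
import random
import time


def calc_pi(n, seed=None):
    """Estimate pi by sampling n points uniformly in the square [-1, 1] x [-1, 1]."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n):
        x = 2.0 * rng.random() - 1.0
        y = 2.0 * rng.random() - 1.0
        # Count the points that fall inside the unit circle.
        if x * x + y * y < 1.0:
            inside += 1
    # The fraction inside approximates (circle area) / (square area) = pi / 4.
    return 4.0 * inside / n


if __name__ == "__main__":
    start = time.time()
    estimate = calc_pi(1_000_000, seed=0)
    print("Execution time:", time.time() - start)
    print("result:", estimate)
```

Because every sample is independent, the loop can be divided across processes with only a final sum to combine, which is why this kind of workload is a convenient first test of a multi-node setup.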
35 changes: 0 additions & 35 deletions docs/source/notsupported.rst

This file was deleted.

35 changes: 16 additions & 19 deletions docs/source/development.rst → docs/source/references.rst
@@ -1,10 +1,19 @@
-.. _development:
+.. _references:

-HPAT Development
-================
+References
+==========

+Technology Overview and Architecture
+------------------------------------
+HPAT implements Pandas and Numpy API as a DSL.
+Data structures are implemented as Numba extensions, and
+compiler stages are responsible for different levels of abstraction.
+For example, `Series data type support <https://github.com/IntelLabs/hpat/blob/master/hpat/hiframes/pd_series_ext.py>`_
+and `Series transformations <https://github.com/IntelLabs/hpat/blob/master/hpat/hiframes/hiframes_typed.py>`_
+implement the `Pandas Series API <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.html>`_.
+Follow the pipeline for a simple function like `Series.sum()`
+for initial understanding of the transformations.
+
-HPAT Technology Overview
-------------------------

 This `slide deck <https://drive.google.com/open?id=1jLikSEAqOFf8kKO8vgT7ru6dKU1LGiDR>`_
 provides an overview of HPAT technology and software architecture.
@@ -17,8 +26,8 @@ These papers provide deeper dive in technical ideas (might not be necessary for
 - `ParallelAccelerator DSL approach <https://users.soe.ucsc.edu/~lkuper/papers/parallelaccelerator-ecoop17.pdf>`_


-Numba Development
------------------
+Numba
+-----

 HPAT sits on top of Numba and is heavily tied to many of its features.
 Therefore, understanding Numba's internal details and being able to develop Numba extensions
@@ -51,15 +60,3 @@ is necessary.
 `get_file_size <https://github.com/IntelLabs/hpat/blob/master/hpat/io.py#L12>`_.
 - | `Developer reference manual <http://numba.pydata.org/numba-doc/latest/developer/index.html>`_
 provides more details if necessary.
-
-HPAT Development
-----------------
-
-HPAT implements Pandas and Numpy API as a DSL.
-Data structures are implemented as Numba extensions, and
-compiler stages are responsible for different levels of abstraction.
-For example, `Series data type support <https://github.com/IntelLabs/hpat/blob/master/hpat/hiframes/pd_series_ext.py>`_
-and `Series transformations <https://github.com/IntelLabs/hpat/blob/master/hpat/hiframes/hiframes_typed.py>`_
-implement the `Pandas Series API <https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.html>`_.
-Follow the pipeline for a simple function like `Series.sum()`
-for initial understanding of the transformations.