This repository has been archived by the owner on Aug 5, 2022. It is now read-only.

Release v0.3 (see CHANGELOG.txt for list of changes)
Signed-off-by: Michael Klemm <michael.klemm@intel.com>
Michael Klemm committed Feb 5, 2015
1 parent f682530 commit e1428ff
Showing 65 changed files with 5,219 additions and 1,506 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -69,3 +69,6 @@ pyMIC-????-??-??.tbz2
*.csv
dgemm_c.x

# IDEA specific ignores
.idea/workspace.xml
.idea/tasks.xml
41 changes: 26 additions & 15 deletions CHANGELOG.txt
@@ -1,15 +1,26 @@
Version 0.2
----------------------------
- Small improvements to the README files.
- New example: Singular Value Decomposition.
- Some documentation for the API functions.
- Added a basic testsuite for unit testing (WIP).
- Bugfix: benchmarks now use the latest interface.
- Bugfix: numpy.ndarray does not offer an attribute 'order'.
- Bugfix: number_of_devices was not visible after import.
- Bugfix: member offload_array.device is now initialized.
- Bugfix: use exception for errors w/ invoke_kernel & load_library.

Version 0.1
----------------------------
Initial release.
Version 0.3
----------------------------

- Improved handling of libraries and kernel invocation.
- Trace collection (PYMIC_TRACE=1, PYMIC_TRACE_STACKS={none,compact,full}).
- Replaced the device-centric API with a stream API.
- Refactoring to better match PEP8 recommendations.
- Added support for int(int64) and complex(complex128) data types.
- Reworked the benchmarks and examples to fit the new API.
- Bugfix: fixed syntax errors in OffloadArray.

Version 0.2
----------------------------
- Small improvements to the README files.
- New example: Singular Value Decomposition.
- Some documentation for the API functions.
- Added a basic testsuite for unit testing (WIP).
- Bugfix: benchmarks now use the latest interface.
- Bugfix: numpy.ndarray does not offer an attribute 'order'.
- Bugfix: number_of_devices was not visible after import.
- Bugfix: member offload_array.device is now initialized.
- Bugfix: use exception for errors w/ invoke_kernel & load_library.

Version 0.1
----------------------------
Initial release.
2 changes: 1 addition & 1 deletion LICENSE.txt
@@ -1,4 +1,4 @@
Copyright (c) 2014, Intel Corporation All rights reserved.
Copyright (c) 2014-2015, Intel Corporation All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
15 changes: 10 additions & 5 deletions Makefile
@@ -1,4 +1,4 @@
# Copyright (c) 2014, Intel Corporation All rights reserved.
# Copyright (c) 2014-2015, Intel Corporation All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
@@ -31,28 +31,33 @@

all:
make -C src all
make -C pyMIC all
make -C pymic all
make -C examples all
make -C benchmarks all

tests: all
make -C tests tests

pep8check:
make -C pymic pep8check
make -C tests pep8check
make -C benchmarks pep8check

clean:
make -C src clean
make -C pyMIC clean
make -C pymic clean
make -C examples clean
make -C benchmarks clean
make -C tests clean

realclean:
make -C src realclean
make -C pyMIC realclean
make -C pymic realclean
make -C examples realclean
make -C benchmarks realclean
make -C tests realclean

tarball: all
make -C examples realclean
tar cfj pyMIC-`date +%F`.tbz2 ../pyMIC/{README.txt,pyMIC,include,examples}
tar cfj pymic-`date +%F`.tbz2 ../pymic/{README.txt,pymic,include,examples}

31 changes: 28 additions & 3 deletions README-developer.txt
@@ -107,18 +107,43 @@ Then you should be able to run the Python application and do some offloads:



4. Debugging
4. Tracing & Debugging
-----------------------

If you are interested in what is going on inside the pyMIC module, you can
If you are interested in what is going on inside the pymic module, you can
choose from several options to get a more verbose output.

You can set the OFFLOAD_REPORT environment variable to request an offload
report from the Intel offload runtime. Please have a look at the article at
https://software.intel.com/en-us/node/510366 to see what values are accepted
for the environment variable and what effect they have.

You can also set PYMIC_DEBUG to enable the debugging output of pyMIC. Here's
The pymic module also supports more specific tracing and debugging.


4.1. Tracing
-----------------------

As of release 0.3, pymic can collect a trace of all performance-relevant calls
into the module. The trace consists of the called function's name, timings,
argument list, and (if collected) the source code location of the invocation.

To enable tracing, set PYMIC_TRACE=1. Shortly before the program finishes, the
tracing information will be printed to stdout in typical Python syntax. You
can then run any desired analysis on the trace data.

For each trace record, pymic records the source code location of the invocation.
This is called the "compact" format (PYMIC_TRACE_STACKS=compact). If the full
call stack of the invocation is needed, PYMIC_TRACE_STACKS=full will collect the
full call stack from the call site of a pymic function up to the top of the
application code. You can turn off stack collection (to increase performance
while tracing) by setting PYMIC_TRACE_STACKS=none.
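
The tracing variables described above can also be set from a small launcher
script. The following is a minimal sketch, assuming a Python 3 interpreter for
the launcher. The helper name trace_env is hypothetical and not part of pymic;
only the variable names and their values are taken from the text above.

```python
import os

# Values of PYMIC_TRACE_STACKS named in the README text.
VALID_STACK_MODES = ("none", "compact", "full")


def trace_env(stacks="compact", base=None):
    """Return a copy of the environment with pymic tracing switched on.

    trace_env is a hypothetical helper for illustration, not a pymic API.
    """
    if stacks not in VALID_STACK_MODES:
        raise ValueError("PYMIC_TRACE_STACKS must be one of %s, got %r"
                         % (", ".join(VALID_STACK_MODES), stacks))
    # Copy so that the caller's environment is left untouched.
    env = dict(os.environ if base is None else base)
    env["PYMIC_TRACE"] = "1"
    env["PYMIC_TRACE_STACKS"] = stacks
    return env
```

A launcher could then pass the result on, for instance with
subprocess.call(["./double_it.py"], env=trace_env("full")). Rejecting unknown
values early avoids silently running with a misspelled stack mode.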


4.2. Debugging
-----------------------

You can set PYMIC_DEBUG to enable the debugging output of pymic. Here's
the list of accepted values and what effect they have. Please note that higher
levels include lower levels, that is, they increase verbosity of the output.

56 changes: 41 additions & 15 deletions README.txt
@@ -1,4 +1,4 @@
0. General Information
0. General Information
-----------------------

Maintainer: Michael Klemm, michael.klemm@intel.com, SSG-DRD EMEA HPC team
@@ -15,13 +15,14 @@ at https://lists.01.org/mailman/listinfo/pymic.

The two biggest limitations at this point are:

(1) pyMIC requires the data to be stored as numpy.array structures
(1) pymic requires the data to be stored as numpy.array structures

(2) the kernel needs to be written in C/C++ (and Fortran) and must be
compiled as a native shared object for KNC.


1. Requirements

1. Requirements
-----------------------

You need to have the following software packages:
@@ -34,32 +35,32 @@ You need to have the following software packages:



2. Setup
2. Setup
-----------------------

To compile the native parts of pyMIC, please see README-developer.txt.
To compile the native parts of pymic, please see README-developer.txt.

To prepare pyMIC, please follow these steps
To prepare pymic, please follow these steps

- $pymic is the base directory of the pyMIC checkout/download
- $pymic is the base directory of the pymic checkout/download

- load the environment of the Intel Composer XE (if it has not been loaded yet):

$> source /opt/intel/composerxe/bin/compilervars.sh intel64

- set the Python search path, so that Python find the pyMIC modules:
- set the Python search path, so that Python finds the pymic modules:

$> export PYTHONPATH=$PYTHONPATH:$pymic/src

- you can set OFFLOAD_REPORT=<level> to see the offloads that are
triggered by pyMIC.
triggered by pymic.

- if you want to have even more fine-grained debugging output,
set the environment variable PYMIC_DEBUG=1.



3. Examples
3. Examples
-----------------------

There are a few (very few!) examples that you can use for your first steps. You
@@ -91,20 +92,45 @@ Then you should be able to run the Python application and do some offloads:

$> ./double_it.py

4. Debugging


4. Tracing & Debugging
-----------------------

If you are interested in what is going on inside the pyMIC module, you can
If you are interested in what is going on inside the pymic module, you can
choose from several options to get a more verbose output.

You can set the OFFLOAD_REPORT environment variable to request an offload
report from the Intel offload runtime. Please have a look at the article at
https://software.intel.com/en-us/node/510366 to see what values are accepted
for the environment variable and what effect they have.

You can also set PYMIC_DEBUG to enable the debugging output of pyMIC. Here's
The pymic module also supports more specific tracing and debugging.


4.1. Tracing
-----------------------

As of release 0.3, pymic can collect a trace of all performance-relevant calls
into the module. The trace consists of the called function's name, timings,
argument list, and (if collected) the source code location of the invocation.

To enable tracing, set PYMIC_TRACE=1. Shortly before the program finishes, the
tracing information will be printed to stdout in typical Python syntax. You
can then run any desired analysis on the trace data.

For each trace record, pymic records the source code location of the invocation.
This is called the "compact" format (PYMIC_TRACE_STACKS=compact). If the full
call stack of the invocation is needed, PYMIC_TRACE_STACKS=full will collect the
full call stack from the call site of a pymic function up to the top of the
application code. You can turn off stack collection (to increase performance
while tracing) by setting PYMIC_TRACE_STACKS=none.


4.2. Debugging
-----------------------

You can set PYMIC_DEBUG to enable the debugging output of pymic. Here's
the list of accepted values and what effect they have. Please note that higher
levels include lower levels, that is, they increase verbosity of the output.

59 changes: 0 additions & 59 deletions TODO.txt

This file was deleted.

5 changes: 4 additions & 1 deletion benchmarks/Makefile
@@ -1,4 +1,4 @@
# Copyright (c) 2014, Intel Corporation All rights reserved.
# Copyright (c) 2014-2015, Intel Corporation All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
@@ -44,6 +44,9 @@ libbenchmark_kernels.so: benchmark_kernels.c
dgemm_c.x: dgemm_c.c
icc -openmp -L$(MKLROOT)/lib/intel64 -lmkl_intel_lp64 -lmkl_core -lmkl_intel_thread -lpthread -lm -g -O2 -o dgemm_c.x dgemm_c.c

pep8check:
pep8 --ignore=W293,W291 *.py

clean:
rm -f libbenchmark_kernels.so
rm -f dgemm_c.x
2 changes: 1 addition & 1 deletion benchmarks/benchmark_kernels.c
@@ -1,4 +1,4 @@
/* Copyright (c) 2014, Intel Corporation All rights reserved.
/* Copyright (c) 2014-2015, Intel Corporation All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are
12 changes: 8 additions & 4 deletions benchmarks/bind.py
@@ -1,6 +1,6 @@
#!/usr/bin/python

# Copyright (c) 2014, Intel Corporation All rights reserved.
# Copyright (c) 2014-2015, Intel Corporation All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
@@ -32,11 +32,14 @@
import sys
import time

import pyMIC as mic
import pymic
import numpy as np

benchmark = sys.argv[0][2:][:-3]

device = pymic.devices[0]
stream = device.get_default_stream()

nrepeat = 1000000
if len(sys.argv) > 1:
nrepeat = int(sys.argv[1])
@@ -46,8 +49,9 @@

timings = []
ts = time.time()
for i in range(nrepeat):
offl_a = device.bin(a)
for i in xrange(nrepeat):
offl_a = stream.bind(a)
stream.sync()
te = time.time()

try:
Expand Down
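
The updated benchmark loop times repeated stream.bind() calls followed by
stream.sync(). As a sketch of that pattern in isolation, the harness below
accepts any object that offers bind() and sync(). The FakeStream stand-in and
the function name time_bind are hypothetical and exist only so that the example
runs without a coprocessor. The sketch assumes that the synchronization happens
once after the loop (the indentation is not visible in this view), and it is
written for Python 3, whereas the benchmark in this commit targets Python 2
(xrange).

```python
import time


class FakeStream(object):
    """Hypothetical stand-in for a pymic stream; it only counts calls."""

    def __init__(self):
        self.binds = 0
        self.syncs = 0

    def bind(self, array):
        # A real stream would associate the array with device memory here.
        self.binds += 1
        return array

    def sync(self):
        # A real stream would wait for all enqueued operations here.
        self.syncs += 1


def time_bind(stream, array, nrepeat):
    """Return the average wall-clock seconds per bind() call."""
    ts = time.time()
    for _ in range(nrepeat):
        stream.bind(array)
    stream.sync()  # assumption: one synchronization after the loop
    te = time.time()
    return (te - ts) / nrepeat
```

With a real device, the stream obtained from
pymic.devices[0].get_default_stream(), as in the diff above, would take the
place of FakeStream.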
