<hr style="border-width:3px; border-color:coral"/>

# How many processes can I access? 

<hr style="border-width:3px; border-color:coral"/>

Before we start talking about "multiprocessing", we need to understand how many processors (CPU cores) are available on the machine we are working on, since that limits how many processes can truly run at the same time.

The command line gives us a few different ways to count the number of CPUs we have. 

In OSX : 

    $ sysctl -a | grep machdep.cpu.core_count
    machdep.cpu.core_count: 4
    
In Linux : 

    $ more /proc/cpuinfo | grep processor
    processor	: 0
    processor	: 1
    processor	: 2
    processor	: 3

Let's do this from within the Jupyter Notebook : 

In [3]:
import subprocess
import shlex

In [4]:
# For OSX : 
cmd = 'sysctl -a'
grep_arg = 'machdep.cpu.core_count'

# For Linux
# cmd = 'more /proc/cpuinfo'
# grep_arg = 'processor'

In [5]:
cmd_args = shlex.split(cmd)
output = subprocess.run(cmd_args,capture_output=True,text=True)
print(output.stdout,end='')

user.cs_path: /usr/bin:/bin:/usr/sbin:/sbin
user.bc_base_max: 99
user.bc_dim_max: 2048
user.bc_scale_max: 99
user.bc_string_max: 1000
user.coll_weights_max: 2
user.expr_nest_max: 32
user.line_max: 2048
user.re_dup_max: 255
user.posix2_version: 200112
user.posix2_c_bind: 0
user.posix2_c_dev: 0
user.posix2_char_term: 0
user.posix2_fort_dev: 0
user.posix2_fort_run: 0
user.posix2_localedef: 0
user.posix2_sw_dev: 0
user.posix2_upe: 0
user.stream_max: 20
user.tzname_max: 255
kern.ostype: Darwin
kern.osrelease: 18.7.0
kern.osrevision: 199506
kern.version: Darwin Kernel Version 18.7.0: Sun Dec  1 18:59:03 PST 2019; root:xnu-4903.278.19~1/RELEASE_X86_64
kern.maxvnodes: 263168
kern.maxproc: 2128
kern.maxfiles: 49152
kern.argmax: 262144
kern.securelevel: 0
kern.hostname: dcs-macbook.boisestate.edu
kern.hostid: 0
kern.clockrate: { hz = 100, tick = 10000, tickadj = 0, profhz = 100, stathz = 100 }
kern.posix1version: 200112
kern.ngroups: 16
kern.job_control: 1
kern.saved_ids: 1
kern.boottime: { sec 

Let's send (or 'pipe') the results of this command through `grep` and output only the information we care about. This will get us something more readable.

To do this, we specify that the `input` to the `subprocess.run()` command should be the output we had from above.

In [6]:
cmd_args = ['grep', grep_arg]
count_out = subprocess.run(args=cmd_args, capture_output=True, text=True,
                           input=output.stdout)     # Input to this call is output from above
print(count_out.stdout, end='')

machdep.cpu.core_count: 4
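
The `grep` output is still just text. If we want the count as a number we can compute with, one option is to parse it in Python. The sketch below is only an illustration: `parse_cpu_count` is a hypothetical helper (not part of `subprocess`), and it assumes the output looks like the OSX and Linux samples shown earlier.

```python
import re

def parse_cpu_count(text):
    """Return a CPU count parsed from sysctl- or /proc/cpuinfo-style text, or None."""
    # OSX style: a single line such as "machdep.cpu.core_count: 4"
    match = re.search(r'^machdep\.cpu\.core_count:\s*(\d+)\s*$', text,
                      flags=re.MULTILINE)
    if match:
        return int(match.group(1))
    # Linux style: one "processor : N" line per CPU, so count the lines
    processors = re.findall(r'^processor\s*:\s*\d+\s*$', text,
                            flags=re.MULTILINE)
    return len(processors) if processors else None

# Sample text copied from the command-line output shown earlier
osx_text = 'machdep.cpu.core_count: 4\n'
linux_text = 'processor\t: 0\nprocessor\t: 1\nprocessor\t: 2\nprocessor\t: 3\n'

print(parse_cpu_count(osx_text))
print(parse_cpu_count(linux_text))
```

In the notebook, we could call `parse_cpu_count(count_out.stdout)` on the result of the cell above.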


If you are sure that you know how to interact with the underlying operating system and shell, you can pass `shell=True` to `subprocess.run()` and give it the whole pipeline as a single string. You will still want to set `capture_output` and `text` to `True` to get the desired output.

In [7]:
cmd = 'sysctl -a | grep machdep.cpu.core_count'
# cmd = 'more /proc/cpuinfo | grep processor'

count_out = subprocess.run(cmd,capture_output=True,text=True,shell=True)
print(count_out.stdout,end='')              

machdep.cpu.core_count: 4
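
One reason for caution with `shell=True` is that the shell interprets spaces and characters such as `;` or `|` in the command string. A minimal sketch of one way to guard against this is to quote the search pattern with `shlex.quote()` before building the string. The helper name `build_pipeline` is made up for this example, and the sketch assumes the command itself (`cmd`) is trusted.

```python
import shlex

def build_pipeline(cmd, pattern):
    """Build a 'cmd | grep pattern' string for use with shell=True."""
    # shlex.quote() wraps the pattern in single quotes when it contains
    # characters the shell would otherwise interpret (spaces, ;, |, ...)
    return f'{cmd} | grep {shlex.quote(pattern)}'

print(build_pipeline('sysctl -a', 'machdep.cpu.core_count'))
print(build_pipeline('more /proc/cpuinfo', 'physical id'))
```

The first pattern contains only safe characters and is left unchanged; the second contains a space, so it is wrapped in single quotes.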


<hr style="border-width:3px; border-color:coral"/>

## Python `multiprocessing` module
<hr style="border-width:3px; border-color:coral"/>

A better way to count the number of processors available, one that does not depend on knowing arcane commands for the underlying shell, is to use the Python `multiprocessing` module.  We will talk more about this in a subsequent notebook, but for now, we use this module to count how many processors we have.

In [8]:
import multiprocessing

In [9]:
multiprocessing.cpu_count()

8

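Why are we getting 8 instead of 4?

The CPU in this machine uses a technology called 'hyperthreading': each physical core presents 2 hardware threads (logical CPUs), giving us 8 threads total.  This is a feature of the hardware rather than of OSX.  `multiprocessing.cpu_count()` reports logical CPUs, so it counts hardware threads rather than physical cores.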

To see that we have 8 threads, we can check the hardware with a command similar to the one used above : 

In [10]:
cmd = 'sysctl -a | grep machdep.cpu.thread_count'    # OSX
# cmd = 'more /proc/cpuinfo | grep "physical id"'    # Linux

count_out = subprocess.run(cmd,capture_output=True,text=True,shell=True)
print(count_out.stdout,end='')     

machdep.cpu.thread_count: 8


For the purposes of multiprocessing, we can use all 8 threads.  As we will see, however, using all 8 threads rather than just the 4 physical cores may have performance implications.
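
The count reported by `multiprocessing.cpu_count()` is not necessarily the number of CPUs our own process is allowed to use. As a rough sketch, on platforms that provide `os.sched_getaffinity()` (Linux does; OSX does not), we can ask which CPUs the current process may run on. The helper name `usable_cpu_count` is an assumption made for this example.

```python
import multiprocessing
import os

def usable_cpu_count():
    """Return the number of logical CPUs this process is allowed to run on."""
    # os.sched_getaffinity() exists only on some platforms (e.g. Linux),
    # so fall back to the logical CPU count elsewhere (e.g. OSX).
    if hasattr(os, 'sched_getaffinity'):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1    # os.cpu_count() may return None

logical = multiprocessing.cpu_count()
usable = usable_cpu_count()
print(f'logical CPUs: {logical}, usable by this process: {usable}')
```

On a laptop the two numbers are usually the same; they can differ on shared systems where a job is restricted to a subset of the CPUs.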

Using the multiprocessing command at the Python prompt in Linux, I get

    >>> import multiprocessing
    >>> multiprocessing.cpu_count()
    4

On the same machine!  Go figure! 

In Linux, we can look at the physical id of each core, and we see that we get : 

    $ more /proc/cpuinfo | grep "physical id"
    physical id	: 0
    physical id	: 2
    physical id	: 4
    physical id	: 6

which suggests that physical ids 0 and 1 are on processor 0, 2 and 3 are on processor 1, and so on.  In fact, checking 'siblings', we get

    $ more /proc/cpuinfo | grep -A 1 "physical id"
    physical id : 0
    siblings    : 1
    --
    physical id : 2
    siblings    : 1
    --
    physical id : 4
    siblings    : 1
    --
    physical id : 6
    siblings    : 1
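
To work with this output in Python, we can pair each 'physical id' with the 'siblings' value that follows it. This is only a sketch: `summarize_cpuinfo` is a hypothetical helper, the sample text is copied from the output above, and the interpretation in the comments (that 'siblings' is the number of logical CPUs sharing a physical id) is my reading of `/proc/cpuinfo` on x86 and should be treated as an assumption.

```python
def summarize_cpuinfo(text):
    """Map each 'physical id' to the 'siblings' value that follows it."""
    packages = {}
    current_id = None
    for line in text.splitlines():
        key, sep, value = line.partition(':')
        if not sep:
            continue                      # skip separator lines such as '--'
        key, value = key.strip(), value.strip()
        if key == 'physical id':
            current_id = int(value)
        elif key == 'siblings' and current_id is not None:
            # Assumption: number of logical CPUs sharing this physical id
            packages[current_id] = int(value)
    return packages

# Sample copied from the grep output above
sample = """physical id : 0
siblings    : 1
--
physical id : 2
siblings    : 1
--
physical id : 4
siblings    : 1
--
physical id : 6
siblings    : 1
"""

packages = summarize_cpuinfo(sample)
print(packages)
print(sum(packages.values()))
```

Summing the 'siblings' values over the distinct physical ids gives 4, which matches what `multiprocessing.cpu_count()` reported on Linux.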