# Memory Profile Notes

## W

This function does not add significantly to overall memory usage; the Mem usage column stays near 115 MiB.

    22  114.766 MiB -378.930 MiB         901   	r = np.sqrt(x**2 + y**2 + z**2)

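The increment reported for this line is -378.930 MiB. A negative increment means memory was released rather than allocated, and the figure is presumably accumulated over the 901 recorded occurrences. The temporary arrays created by this calculation may still be worth reducing.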

    24  115.391 MiB -1152.188 MiB         901   	w = (1.0 / (h*np.sqrt(np.pi)))**3 * np.exp( -r**2 / h**2)

This line could be optimized further with in-place operations, which reuse existing buffers instead of allocating a new array for each intermediate result.
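As a minimal sketch of what in-place operations could look like here, the version below builds the kernel with NumPy's `out=` argument and augmented assignment, so only two full-size buffers are alive at any time. The name `W_inplace` is hypothetical, and the sketch assumes `x`, `y`, and `z` are arrays of the same shape. It also skips the square root, since the kernel only needs r squared, so results may differ from the original in the last floating-point digits.

```python
import numpy as np

def W_inplace(x, y, z, h):
    """Gaussian kernel for separations (x, y, z) and smoothing length h.

    Hypothetical rewrite of W that reuses two buffers instead of
    allocating a new array for every intermediate result.
    """
    # No copy is made if the inputs are already float64 arrays.
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    z = np.asarray(z, dtype=np.float64)

    r2 = np.multiply(x, x)        # first buffer: x**2
    tmp = np.multiply(y, y)       # second buffer: y**2
    r2 += tmp                     # r2 = x**2 + y**2, no new array
    np.multiply(z, z, out=tmp)    # reuse tmp for z**2
    r2 += tmp                     # r2 = x**2 + y**2 + z**2

    r2 /= -(h * h)                # r2 now holds -r**2 / h**2
    np.exp(r2, out=r2)            # exponentiate in place
    r2 *= (1.0 / (h * np.sqrt(np.pi))) ** 3
    return r2
```

The inputs are left untouched because the first two products allocate fresh arrays; every later step writes into one of those two buffers.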

## gradW

This function accounts for even more memory allocation than W.

    39  114.289 MiB  348.824 MiB         301   	r = np.sqrt(x**2 + y**2 + z**2)

This line accounts for an increment of 348.824 MiB because the expression creates at least four separate temporary arrays in memory. It could be rewritten to use less memory.

    41  115.559 MiB  365.629 MiB         301   	n = -2 * np.exp( -r**2 / h**2) / h**5 / (np.pi)**(3/2)

This line accounts for an increment of 365.629 MiB because it also creates several temporary arrays.

    42  116.734 MiB  329.164 MiB         301   	wx = n * x
    43  117.957 MiB  348.512 MiB         301   	wy = n * y
    44  119.180 MiB  346.652 MiB         301   	wz = n * z

Each of these lines adds memory by creating a new large array with the same shape as x, y, and z.
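One possible way to trim these allocations is sketched below. The three returned arrays cannot be avoided, but the prefactor `n` can be built in a single scratch buffer and that buffer can then be reused as the third output. The name `gradW_lean` is hypothetical, and the sketch assumes `x`, `y`, and `z` are float arrays of the same shape.

```python
import numpy as np

def gradW_lean(x, y, z, h):
    """Gradient of the Gaussian kernel, built with one scratch buffer.

    Hypothetical rewrite of gradW; returns (wx, wy, wz).
    """
    n = x * x                     # scratch buffer, starts as x**2
    n += y * y                    # y*y is a short-lived temporary
    n += z * z                    # n = r**2
    n /= -(h * h)                 # n = -r**2 / h**2
    np.exp(n, out=n)              # n = exp(-r**2 / h**2)
    n *= -2.0 / (h**5 * np.pi**1.5)

    wx = n * x                    # new array (required output)
    wy = n * y                    # new array (required output)
    n *= z                        # reuse the scratch buffer as wz
    return wx, wy, n
```

Compared with the profiled lines, this drops the separate `r` array and the dedicated `wz` allocation, at the cost of slightly less readable code.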

## getPairwiseSeparations

This function seems to use a significant amount of memory. Lines 57-68 show no significant allocations, so memory is fairly stable up to that point; these lines only reshape data that is already allocated.

Line 71 has an increment of 1035.637 MiB, line 72 an increment of 682.832 MiB, and line 73 an increment of 313.457 MiB.

This is a significant increase in memory allocation and has potential to be optimized, since each of these lines builds a full matrix of pairwise separations.

    70  # matrices that store all pairwise particle separations: r_i - r_j
    71  110.551 MiB 1035.637 MiB        1202   	dx = rix - rjx.T
    72  111.773 MiB  682.832 MiB        1202   	dy = riy - rjy.T
    73  112.996 MiB  313.457 MiB        1202   	dz = riz - rjz.T
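To get a feel for how large one of these matrices is, the sketch below reproduces the broadcasting pattern and reports the size of the result. The particle count of 400 is a hypothetical value chosen for illustration, not something taken from the profile, and `pairwise_matrix_mib` is a made-up helper name.

```python
import numpy as np

def pairwise_matrix_mib(n_particles):
    """Return one pairwise-separation matrix and its size in MiB."""
    pos = np.zeros((n_particles, 3))          # placeholder float64 positions
    rix = pos[:, 0].reshape((n_particles, 1))
    rjx = pos[:, 0].reshape((n_particles, 1))
    dx = rix - rjx.T                          # (N, 1) - (1, N) -> (N, N)
    return dx, dx.nbytes / 2**20

dx, mib = pairwise_matrix_mib(400)            # 400 is an assumed particle count
print(dx.shape)                               # (400, 400)
print(round(mib, 2))                          # 1.22
```

For that assumed count, each matrix takes about 1.22 MiB, which happens to be close to the step in the Mem usage column between lines 71, 72, and 73. The cost grows with the square of the particle count, so doubling the particles quadruples each matrix.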

## getDensity

Memory jumps by 2154.090 MiB when this function calls getPairwiseSeparations().
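If the full pairwise matrices are only needed to form a sum, one option is to process the sample points in blocks so that only a slice of each matrix exists at once. The sketch below assumes the density is the sum over particles of `m * W`, which is not shown in these notes; `gaussian_W`, `density_chunked`, and the `chunk` parameter are hypothetical names.

```python
import numpy as np

def gaussian_W(dx, dy, dz, h):
    """Gaussian smoothing kernel evaluated on separation arrays."""
    r2 = dx**2 + dy**2 + dz**2
    return (1.0 / (h * np.sqrt(np.pi)))**3 * np.exp(-r2 / h**2)

def density_chunked(r, pos, m, h, chunk=128):
    """Density at points r (M x 3) from particles pos (N x 3), in blocks.

    Assumes density = sum_j m * W(r_i - r_j, h); m is a scalar mass.
    Peak temporary size is about chunk x N instead of M x N.
    """
    M = r.shape[0]
    rho = np.empty((M, 1))
    for start in range(0, M, chunk):
        block = r[start:start + chunk]          # (c, 3) slice of sample points
        dx = block[:, 0:1] - pos[:, 0]          # (c, 1) - (N,) -> (c, N)
        dy = block[:, 1:2] - pos[:, 1]
        dz = block[:, 2:3] - pos[:, 2]
        rho[start:start + chunk, 0] = np.sum(m * gaussian_W(dx, dy, dz, h), axis=1)
    return rho
```

A smaller `chunk` lowers peak memory but adds Python loop overhead, so the value would need tuning against the actual particle count.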

## getPressure

There does not seem to be any increase in memory usage in this function. 

## getAcc

The getAcc function calls both getPairwiseSeparations() and gradW(), which contribute increments of 1093.688 MiB and 1099.066 MiB respectively.

We need to look at getPairwiseSeparations and gradW for optimization, perhaps using float32 instead of float64 if precision allows, and restructuring the vectorized computations to avoid large intermediate arrays.
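As a rough illustration of the float32 idea, the sketch below builds the same pairwise matrix in both precisions and compares size and rounding error. The array size, the random positions, and the helper name `pairwise_dx` are assumptions made for the example only.

```python
import numpy as np

def pairwise_dx(pos):
    """Pairwise x separations for an (N, 3) position array."""
    col = pos[:, 0].reshape((-1, 1))
    return col - col.T                      # result keeps the dtype of pos

rng = np.random.default_rng(0)
pos64 = rng.standard_normal((200, 3))       # hypothetical float64 positions
pos32 = pos64.astype(np.float32)            # same data in single precision

dx64 = pairwise_dx(pos64)
dx32 = pairwise_dx(pos32)

print(dx64.nbytes, dx32.nbytes)             # 320000 160000
print(np.max(np.abs(dx64 - dx32)))          # size of the rounding error
```

The float32 matrix is half the size. Whether the extra rounding error is acceptable depends on the simulation, and any float64 array mixed into a later expression would promote the result back to float64, so the cast has to be applied consistently.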

## main

There is a lot going on in main, so there may be a lot to look at for optimization, especially since some of the reported increments are negative.

   172  105.410 MiB    0.020 MiB           1   	np.random.seed(42)            # set the random number generator seed

   174  105.445 MiB    0.035 MiB           1   	lmbda = 2*k*(1+n)*np.pi**(-3/(2*n)) * (M*gamma(5/2+n)/R**3/gamma(1+n))**(1/n) / R**2  # ~ 2.01

This lmbda calculation adds a small allocation (0.035 MiB).

   176  105.469 MiB    0.023 MiB           1   	pos   = np.random.randn(N,3)   # randomly selected positions and velocities

   177  105.480 MiB    0.012 MiB           1   	vel   = np.zeros(pos.shape)

The np.random.randn and np.zeros calls each add a small allocation.

   180  105.664 MiB    0.184 MiB           1   	acc = getAcc( pos, vel, m, h, k, n, lmbda, nu )

The getAcc() call adds 0.184 MiB.

   186  106.184 MiB    0.512 MiB           1   	fig = plt.figure(figsize=(4,5), dpi=80)
   188  106.562 MiB    0.379 MiB           1   	ax1 = plt.subplot(grid[0:2,0])
   189  106.754 MiB    0.191 MiB           1   	ax2 = plt.subplot(grid[2,0])

Plotting the figure takes memory. Consider creating the axes once before the loop to avoid redundant operations, and consider caching the density values instead of recomputing them repeatedly.
