# Problem 1: LLMs

### HW1 Problem 2

I gave Claude 3.7 Sonnet the exact text of HW1 problem 2 (with the removal of unnecessary details). I have saved its generated code as 'problem1.1_claude.py'. I have made absolutely no edits to the code. On the first prompt, Claude understood the assignment and implemented a well-worked solution. However, it used a rejection-sampling techinuqe for generating velocities and gave a maximum number of iterations before it simply assigned the particle the mean energy. This resulted in almost every particle being given the mean energy. 

Rather than let that instance run to completion, I reprompted Claude with "Can you find a way to make the velocity sampling more efficient? It is reaching the maximum attempts for almost every particle. This is likely because the psudo-powerlaw shape means it is extremely unlikely that acceptances will happen at high energies." It immediately decided to swap to a numerically integrated CDF technique to generate the samples. This second iteration of the code ran in 234.39 seconds according to its built in timer and saved a plot and data files. Although the code and description of the approach look very good, the resulting plot, which I've renamed 'velocity_dispersion_v2.png' shows that the code does not recover the correct velocity dispersion profile. 

I reprompted Claude with "This code is not producing correct results. The sampled result is far greater than the theoretical result in the plot. Make sure you're using Eq. 10 from Hernquist as the theoretical prediction. Also, it seems like you're taking the standard deviation in radial bins to get the velocity dispersion. You should just be taking the mean of the squared radial velocities in each bin instead." Claude claimed to make the requested changes along with adding new numerical safeguards and efficieny improvements related to the sampling. It also said it updated the minimum energy at every radius, which if it wasn't doing before would have been a significant bug. I renamed the resulting velocity dispersion plot 'velocity_dispersion_v3.png'. The theoretical curve is now much worse and the match is terrible. The sample might be okay, although it doesn't fall off fast enough at large radii it seems. 

I prompted it again asking it to fix the theoretical result, use more bins, and to only worry about radii less than 10^4 kpc. The result was actually worse than before. This time the velocity dispersion plot is labeled with v4.

I spent a while looking through Claude's code and mine and found that there are issues in the energy range Claude's code samples, and it is back to using the wrong version of Eq 10 from the Herquist paper. I tried to prompt it to fix this, but I exceeded the allowed context window and needed to start a new chat. 

I made those two changes and the result was much better. The shape of the sample is correct now, but there remains a normalization issue. I'll consider this a near success for Claude. It got the structure of the code correct and was most of the way there, but needed a bit of help to get over the finish line. Unfortunately, that meant I had to go through its code to debug and would still have to do more if I wanted to solve the problem perfectly. 

The solution it wrote follows the same approach as mine generally, but through very different methods. It compartmentalizes its code similar to me, using separate functions for all the simple equations. Its code is generally tractable, with good commenting. 


### HW2 Problem 1

This time instead of giving Claude the exact problem text, I gave it a more streamlined version: "Can you write Python code to solve the Kepler problem with a central mass of 1 M_sun,  a=1AU, and e=0.96 using 4th order Runge-Kutta for the integration. Integrate the orbit for 1 year and initialize the particle at perihelion."

Claude turned out a very sophisticated code that tracks not only the orbit but also the energy and angular momentum conservation. Claude also recognized the need for small timesteps to account for the fast motion of the highly eccentric orbit. It both implemented a small time step and explained why in its accompanying response. It also included the generation of an animation of the orbit in its code, although it commented out the creating of the animation in the main functino. The code it wrote is in 'problem_2.1_claud.py'. 

I reprompted Claude to have it save the plot and animation as files. The code seems to be an excellent solution to the problem, needing no significant modifications as far as the science is concerned. It had errors in the animation function, but after a couple of adjustments, it manged to find a way that worked. 

This problem was a massive success for Claude. It solved in on the very first prompt and went above and beyond in its implementation. The solution it wrote is somewhat different than my implementation. It wrote far more code than I did, for one. It seems much more interested in clarity than efficiency, in terms of code length. It created a plot very similar to what I ended up making, which I found very surprising. It couldn't implement the solution all that different and still do it right, but it clearly approaches the coding far differently than me. 


### Hogg and Foreman-Mackey Problem 2

For this exercise, I used ChatGPT's o4-mini model. I decided to try using its "reasoning" feature. I gave it the exact text of the problem from the paper. It output a plot of the solution as the question requests. In fact, to get the code, I had to manually choose to see its "analysis". The code seems to solve the problem exactly as requested. It is efficient, well-commented, and produces the plot as requested. No modification or reprompting was needed. 

I tried ChatGPT without the reasoning, and this time it didn't show the solution plot it had generated but rather gave the code and explanation like normal. That code also produces the correct result on the first attempt. This code is slightly less efficient and produces a slightly nicer looking plot. It also chose not to implement its own Gaussian density function.

This is a rather simple code, so mine and ChatGPT's versions are very similar. I had written versions that looked like both the standard GPT response and the more efficient reasoning response. There wasn't too much room for flexibility here, but the code it wrote was exceedingly normal. It didn't do anything super fancy or opaque. 


### Conclusions

The LLMs did an overall impressive job solving these problems from the homeworks. The very first problem was by far the most complicated we faced, and the AI struggled with it the most as well. It couldn't replace me for that problem because not only did it requrie multiple re-prompts, it never got to a solution that didn't require debugging from me. 

For the other 2 problems, they were plenty simple for the AIs to easily turn out sophisticated solutions. If I wanted, I could easily have used AI to do those problems in seconds, and it would have been relatively hard to determine that they were written by LLMs. The code was fairly human, and small edits to syntax and commenting would have made it pretty convincingly mine. Of course, since I could just copy paste the problem text or basically explain it, I didn't need more than the slightest understanding of the problem or the content to get effective solutions. For the MCMC problem, I could have made the requested plot without knowing anything because I literally copy pasted and got the plot back immediately (without even seeing the code in the case of the reasoning model). 



# Problem 2: Periodic Boundary Conditions

In HW3, I wrote the algorithm for calculating the gravitational accelerations on a catalog of particles using a brute force approach. That function did not take advantage of the optimizations of Numpy arrays. Therefore, I rewrote my brute force approach in addition to adding the option for periodic boundary conditions.

In [1]:
import numpy as np
import matplotlib.pyplot as pl

In [2]:
def brute_force_acc(cat, ms=None, eta=0, periodic_bc=False, boxsize=None, G=1):
    
    acc = np.empty(cat.shape)

    n = len(cat)
    dims = cat.shape[1]

    if ms is None:
        ms = np.ones(n)[:,None,None]
    else:
        ms = ms[:,None,None]

    if boxsize is None:
        boxsize = int(max(cat))+1

    coord_dists = np.zeros((n,n,dims))

    for i in range(dims):
        coord_dists[:,:,i] = cat[:,i:i+1] - cat[:,i:i+1].T
        
        if periodic_bc:
            halfL = boxsize / 2
            coord_dists[:,:,i][coord_dists[:,:,i]>halfL] = coord_dists[:,:,i][coord_dists[:,:,i]>halfL] - boxsize
            coord_dists[:,:,i][coord_dists[:,:,i]<halfL] = coord_dists[:,:,i][coord_dists[:,:,i]<halfL] + boxsize


    denom = np.sum(coord_dists**2, axis=2) + eta**2
    nonzeros = (denom > 0)
    denom[nonzeros] = denom[nonzeros]**(-1.5)

    acc = -G * ms * denom[:, :, None] * coord_dists
    acc = np.sum(acc, axis=0)
    return acc

In [66]:
def brute_acc_old(cat, eta=0):
    '''
    Brute force n-body acceleration function

    Parameters
    ----------
    cat: catalog of object coordinates
    f: acceleration function
    eta: softening factor

    Returns
    --------
    nxd array of acceleration on all n particles in all d dimensions
    '''
    acc = np.zeros(cat.shape)

    #double loop to consider force of every particle on every other
    for i in range(len(cat)):
        for j in range(i):
            a = (np.sum((cat[j]-cat[i])**2)+eta**2)**(-1.5)*(cat[j]-cat[i])

            #Newton's third law says we don't have to revisit this pairing in reverse
            acc[i] -= a
            acc[j] += a
    
    return acc

In [3]:
rng = np.random.default_rng(2)
dat = rng.random((10000,3))*1000

In [6]:
a1 = brute_force_acc(dat, boxsize=1000)

In [72]:
a2 = brute_acc_old(dat)

In [74]:
print(a1-a2)

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 ...
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]


I reworked my old code slightly so it doesn't use any other functions to make it easier to paste here without clutter. The first test was a sanity check to make sure that my new implementation using fancy Numpy vectorization achieves the same result as the old, naieve algorithm. The output above shows that the results are identical, except that the old algorithm ran for 2 minutes 45 seconds and the new algorithm ran in 7.3 seconds. 

My new function not only runs faster, but allows for periodic boundary conditions. I implemented the "minimum image convention" laid out [here](https://rwexler.github.io/comp-prob-solv/lecture-16-tech-details.html). 

In [7]:
a11 = brute_force_acc(dat, boxsize=1000, periodic_bc=True)

In [11]:
print(np.sum((a11-a1)>0))

print(dat[abs(np.sum(a11-a1, axis=1)) < 1e-5][:10])

12581
[[274.89590071 766.45420008 357.69819844]
 [ 30.86154458 938.0372885  151.18293521]
 [168.94833016 396.75096739 675.5774014 ]
 [162.16598222 611.87997058 539.68288764]]
