# Research Notebook
## Dia Kalra
## Date: September 2025 - 19th January 2026

# 1: Experience
## Describe at least one research activity you worked on this week. 

- Conducted a comprehensive literature review and delivered a 15-minute technical presentation on "The Impact of the Large Magellanic Cloud on Dark Matter Direct Detection Signals" by Smith-Orlik et al. to the research team, including the PI and junior mentor, communicating complex astrophysical concepts and their experimental implications (November 2025)

- Developed a Python-based analysis pipeline utilizing h5py, NumPy, and Matplotlib to process and visualize multi-dimensional dark matter simulation data from cosmological N-body simulations, handling datasets with O(10⁵) dark matter particles across multiple snapshots

- Engineered modular analysis functions implementing:
  - HDF5 data extraction and preprocessing workflows for dark matter particle coordinates, velocities, and subhalo catalogs
  - Log-spaced radial binning algorithms for density profile calculations spanning 1–700 kpc
  - Spherical shell volume-weighted density computations for both particle number and mass distributions
  - Statistical analysis of velocity phase-space distributions in the galactocentric frame

- Generated a multi-panel visualization suite comparing the isolated Milky Way configuration (snapshot 105) against the present-day MW+LMC system (snapshot 153), including:

  - Phase-space velocity distributions (vₓ-vᵧ projections) for more than 100,000 dark matter particles
  - Subhalo kinematic distributions revealing satellite galaxy velocity structure
  - Radial distribution profiles using logarithmic binning to capture structure across 2+ orders of magnitude in radius
  - Three distinct density profiles: dark matter particle density, subhalo number density, and subhalo mass density as functions of galactocentric radius

- Participated in Q&A session following the presentation, answering questions about simulation methods, sources of uncertainty, and how the results connect to experiments

## Motivation:

The Large Magellanic Cloud (LMC) is a satellite galaxy of the Milky Way whose dark matter halo overlaps with the MW's at its closest approach (pericenter). Dark matter from the LMC is thought to affect the dark matter distribution in the Solar neighborhood of the MW, with important consequences for ground-based dark matter detection experiments.

The Smith-Orlik et al. paper presented results showing that dark matter particles originating from the LMC contribute to the high-speed tail of the velocity distribution in the Solar neighborhood. It is also thought that native MW dark matter particles receive a boost in their velocity distribution due to gravitational effects induced by the LMC's motion at or near pericenter.

To investigate this further, I am using simulation snapshots to compare an isolated MW halo (snapshot 105, before LMC influence) with the MW+LMC system at present day (snapshot 153). By analyzing the density profiles, velocity distributions, and radial structure in both scenarios, I can identify the specific effects of the LMC on the MW's dark matter distribution and quantify how these changes impact the halo integral—a key quantity for interpreting direct detection experiments.

# 2: What? (What happened?)
## Describe what happened during your activities for the week.

**Phase I: Literature Synthesis and Communication (September - November 2025)**

I conducted a systematic review of the Smith-Orlik et al. paper, which uses the Auriga magneto-hydrodynamical simulations to study how the LMC affects dark matter in the Solar neighborhood and its implications for direct detection experiments. The study selected 15 MW-LMC analogue systems from the Auriga suite, requiring that the LMC analogue's stellar mass and pericenter distance match observations. For detailed analysis, they focused on one system (halo 13) with re-simulated finer snapshots, examining four key epochs: isolated MW (proxy for MW without LMC), first pericenter (~133 Myr before present), present day, and future (~175 Myr after present).

In November 2025, I delivered a 15-minute technical presentation summarizing the paper's objectives, methodology, and findings. The study found that the LMC's impact manifests through two mechanisms: direct contamination from LMC-origin dark matter particles (0.0077-2.8% of particles in the Solar region across different halos) that dominate the high-speed tail (v > 500 km/s), and a dynamical response in which native MW particles receive velocity boosts (Δv ~ 20-40 km/s) due to the LMC's gravitational influence.

These effects modify the halo integral η(v_min), shifting it toward higher speeds by more than ~150 km/s at present day compared to the isolated MW. For direct detection experiments, this translates to significant changes in exclusion limits, particularly at low dark matter masses. For xenon-based experiments like LZ, the LMC lowers exclusion limits by orders of magnitude for m_χ < 10 GeV, while for germanium-based experiments like SuperCDMS, similar effects occur for m_χ < 1 GeV.

The study demonstrates that even in fully cosmological simulations with complex formation histories, the LMC's recent pericenter passage significantly impacts the local dark matter environment. Following the presentation, I participated in a Q&A session answering questions about simulation methods, sources of uncertainty, and how the results connect to experiments.

**Phase II: Computational Infrastructure Development (December 2025 - January 2026)**
Building on the theoretical foundation from the literature review, I developed original analysis code to interface with the simulation data.

**Data Architecture and I/O Pipeline:**
Constructed load_mw_dm_snapshot() function for HDF5 data ingestion, extracting:
  - coord_dm: 3D position vectors for dark matter particles (N × 3 array, kpc)
  - vel_dm: 3D velocity vectors in galactocentric frame (N × 3 array, km/s)
  - subflags: Integer array mapping particles to parent subhalo IDs
  - coord_sub: Subhalo center positions
  - vel_sub: Subhalo bulk velocities

Implemented get_snapshot_file() for snapshot management:
  - Snapshot 105: Isolated MW at early time (control case)
  - Snapshot 153: Present-day MW+LMC system

**Phase-Space Analysis:**

Generated velocity distribution scatter plots visualizing 6D phase-space structure projected onto 2D velocity planes, revealing velocity dispersion structure (σ_v ~ 150-200 km/s), velocity substructure and streams, and qualitative differences between isolated and LMC-perturbed configurations.
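
The dispersion and high-speed population described above can also be estimated numerically from the velocity array. The following is a minimal sketch rather than the notebook's actual code: `velocity_summary` and its `v_cut` parameter are hypothetical names, and the synthetic Gaussian velocities only stand in for `vel_dm`.

```python
import numpy as np

def velocity_summary(vel, v_cut=400.0):
    """Summarize an (N, 3) velocity array given in km/s.

    Returns the per-axis dispersion, the 3D dispersion, and the
    fraction of particles whose speed exceeds v_cut.
    """
    vel = np.asarray(vel, dtype=float)
    # Per-axis standard deviation about the mean velocity
    sigma_axis = vel.std(axis=0)
    # 3D dispersion: quadrature sum of the per-axis dispersions
    sigma_3d = np.sqrt(np.sum(sigma_axis**2))
    # Speed of each particle and the fraction above the cut
    speed = np.linalg.norm(vel, axis=1)
    frac_fast = np.mean(speed > v_cut)
    return sigma_axis, sigma_3d, frac_fast

# Synthetic stand-in for vel_dm (assumption: isotropic Gaussian, 160 km/s per axis)
rng = np.random.default_rng(0)
vel_demo = rng.normal(0.0, 160.0, size=(100_000, 3))
sigma_axis, sigma_3d, frac_fast = velocity_summary(vel_demo)
```

Running the same function on snapshots 105 and 153 would turn the visual comparison of the scatter plots into two numbers per snapshot, a dispersion and a high-speed fraction.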

**Radial Structure Analysis:**

Developed get_binned_data() computing subhalo radial distributions:
```python
import h5py
import numpy as np

def get_binned_data(file_path):
    """Count satellite subhalos in log-spaced radial bins around the MW."""
    with h5py.File(file_path, 'r') as f:
        coords = f['coord_subhalos'][:]

    # Subhalo 0 is taken as the MW host; the rest are the small galaxies orbiting it
    mw_center = coords[0]
    satellites = coords[1:]

    # Galactocentric distance of each satellite
    dist = np.linalg.norm(satellites - mw_center, axis=1)
    # 15 log-spaced bin edges from 1 to 700 kpc, i.e. 14 bins
    bins = np.logspace(np.log10(1), np.log10(700), 15)
    # Counts per bin
    counts, bin_edges = np.histogram(dist, bins=bins)
    # Arithmetic midpoint of each bin, used as the plotting position
    bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2

    return bin_centers, counts
```
The 15 log-spaced edges define 14 bins covering nearly three orders of magnitude in radius (1-700 kpc), revealing how the LMC's presence alters the satellite distribution at r ~ 50-150 kpc.

**Number Density Profile Calculations:**

Implemented number_density_profile() computing volume-weighted number densities:
```python
def number_density_profile(file_path):
    """Number density of DM particles and subhalos in spherical shells."""
    with h5py.File(file_path, 'r') as f:
        coord_dm = f['coord_dm'][:]
        coord_subhalos = f['coord_subhalos'][:]

    # Assumes the first DM entry marks the MW center; if it does not,
    # coord_subhalos[0] (the host) is the consistent choice of origin
    mw_center = coord_dm[0]
    dm_particles = coord_dm[1:]
    satellites = coord_subhalos[1:]  # skip the MW host itself

    dist_dm = np.linalg.norm(dm_particles - mw_center, axis=1)
    dist_sub = np.linalg.norm(satellites - mw_center, axis=1)

    # 12 log-spaced edges from 10 to 700 kpc, i.e. 11 shells
    bins = np.logspace(np.log10(10), np.log10(700), 12)
    bin_centers = (bins[:-1] + bins[1:]) / 2
    dm_counts, _ = np.histogram(dist_dm, bins=bins)
    sub_counts, _ = np.histogram(dist_sub, bins=bins)

    # Shell volume between consecutive edges, in kpc^3
    shell_volumes = (4/3) * np.pi * (bins[1:]**3 - bins[:-1]**3)
    dm_density = dm_counts / shell_volumes
    sub_ndensity = sub_counts / shell_volumes

    return bin_centers, dm_density, sub_ndensity
```
Calculated number density ρ_N(r) [kpc⁻³] for both DM particles (particles/kpc³) and subhalos (subhalos/kpc³) over 11 log-spaced shells from 10 to 700 kpc.
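
One way to check the shell-volume estimator is to apply it to a distribution whose density is known in advance. The sketch below is an illustrative test on synthetic data, not part of the analysis pipeline: `shell_number_density` is a hypothetical helper that repeats the counts-over-volume step, and the particles are drawn uniformly in volume so the recovered density should be flat.

```python
import numpy as np

def shell_number_density(r, bins):
    """Counts per spherical shell divided by the shell volume."""
    counts, _ = np.histogram(r, bins=bins)
    volumes = (4.0 / 3.0) * np.pi * (bins[1:]**3 - bins[:-1]**3)
    return counts / volumes

rng = np.random.default_rng(42)
n, r_max = 200_000, 700.0
# Uniform in volume: the enclosed fraction scales as r^3, so r = r_max * u^(1/3)
r = r_max * rng.random(n) ** (1.0 / 3.0)

# Same edges as the profiles above: 12 log-spaced edges from 10 to 700 kpc
bins = np.logspace(np.log10(10), np.log10(700), 12)
density = shell_number_density(r, bins)

# True density of the synthetic sphere, in particles per kpc^3
expected = n / ((4.0 / 3.0) * np.pi * r_max**3)
```

The outer shells should agree closely with `expected`, while the innermost shells hold only a handful of synthetic particles and scatter more. That same caveat applies to the inner bins of any sparse tracer such as subhalos.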

**Dark Matter Mass Density Profile:**

Developed dm_mass_density_profile() to calculate the actual mass distribution:
```python
def dm_mass_density_profile(file_path, m_dm=3e5):
    """DM mass density in spherical shells; m_dm is the particle mass in M_sun."""
    with h5py.File(file_path, 'r') as f:
        coord_dm = f['coord_dm'][:]

    # Assumes the first DM entry marks the MW center (verify against the catalog)
    mw_center = coord_dm[0]
    dm_particles = coord_dm[1:]
    dist_dm = np.linalg.norm(dm_particles - mw_center, axis=1)

    # 12 log-spaced edges from 10 to 700 kpc, i.e. 11 shells
    bins = np.logspace(np.log10(10), np.log10(700), 12)
    bin_centers = (bins[:-1] + bins[1:]) / 2
    dm_counts, _ = np.histogram(dist_dm, bins=bins)
    shell_volumes = (4/3) * np.pi * (bins[1:]**3 - bins[:-1]**3)

    total_mass_in_bin = dm_counts * m_dm  # Total mass = N_particles × mass_per_particle
    dm_mass_density = total_mass_in_bin / shell_volumes  # M_☉/kpc³

    return bin_centers, dm_mass_density
```
This converts particle counts to physical mass by multiplying by the simulation particle mass (3×10⁵ M_☉), yielding ρ_mass(r) [M_☉/kpc³]. Calculated both the mass density profiles and the ratio between present-day and isolated MW configurations to quantify the LMC's impact on the total dark matter mass distribution.
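
The ratio between the two configurations can be sketched as follows. This is an illustrative helper under stated assumptions, not the code used for the plots: `density_ratio` is a hypothetical name, both snapshots are assumed to share the same bin edges and particle mass (so shell volumes and m_dm cancel and the ratio reduces to a ratio of counts), and the uncertainty assumes independent Poisson counts in each bin.

```python
import numpy as np

def density_ratio(counts_late, counts_early):
    """Ratio of two binned profiles with Poisson uncertainties.

    Inputs are particle counts per radial bin on identical bin edges.
    Bins with zero counts in either snapshot are returned as NaN.
    """
    late = np.asarray(counts_late, dtype=float)
    early = np.asarray(counts_early, dtype=float)
    ratio = np.full_like(late, np.nan)
    err = np.full_like(late, np.nan)
    ok = (late > 0) & (early > 0)
    ratio[ok] = late[ok] / early[ok]
    # Relative Poisson errors added in quadrature: sqrt(1/N_late + 1/N_early)
    err[ok] = ratio[ok] * np.sqrt(1.0 / late[ok] + 1.0 / early[ok])
    return ratio, err

# Made-up counts for illustration only (not simulation output)
ratio_demo, err_demo = density_ratio([100, 0, 50], [50, 10, 0])
```

Returning NaN for empty bins keeps them out of the plot instead of producing infinities, and the error estimate helps separate a real departure from 1.0 from shot noise in sparsely populated shells.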

**Subhalo Mass Density Profile:**

Extended with subhalo_mass_density_profile():
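```python
def subhalo_mass_density_profile(file_path, m_dm=3e5):
    """Mass density of satellite subhalos in spherical shells (M_sun/kpc^3)."""
    with h5py.File(file_path, 'r') as f:
        sub_coords = f['coord_subhalos'][:]
        sub_ids_per_particle = f['subflags'][:]

    # Count member DM particles per subhalo ID
    unique_ids, counts = np.unique(sub_ids_per_particle, return_counts=True)
    num_subhalos = len(sub_coords)
    subhalo_masses = np.zeros(num_subhalos)
    # Drop IDs outside the catalog (such as a negative flag, if one marks unbound particles)
    mask = (unique_ids >= 0) & (unique_ids < num_subhalos)
    # Subhalo mass = member count x particle mass, as in dm_mass_density_profile
    subhalo_masses[unique_ids[mask].astype(int)] = counts[mask] * m_dm

    mw_center = sub_coords[0]  # subhalo 0 is taken as the MW host
    satellite_coords = sub_coords[1:]
    satellite_masses = subhalo_masses[1:]
    dist_subs = np.linalg.norm(satellite_coords - mw_center, axis=1)

    bins = np.logspace(np.log10(10), np.log10(700), 12)
    bin_centers = (bins[:-1] + bins[1:]) / 2
    shell_volume = (4.0/3.0) * np.pi * (bins[1:]**3 - bins[:-1]**3)
    # Mass-weighted histogram: total satellite mass in each shell
    total_mass_in_bin, _ = np.histogram(dist_subs, bins=bins, weights=satellite_masses)
    sub_mass_density = total_mass_in_bin / shell_volume

    return bin_centers, sub_mass_density
```
This yields the subhalo mass density ρ_mass(r) [M_☉/kpc³], capturing both the number and the size distribution of subhalos. Each subhalo's mass is estimated as its member dark matter particle count times the particle mass m_dm, so it includes only the particles flagged in subflags.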

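All functions are modular, rely on vectorized NumPy operations instead of Python loops, and open HDF5 files through context managers so that file handles are released as soon as the arrays are read. Their outputs feed directly into the Matplotlib figures.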


# 3: So what? (What does it mean?)
## Describe your results

**Quantitative Findings from Generated Plots**

Based on the velocity distributions, radial profiles, and density measurements from snapshots 105 (isolated MW) and 153 (present-day MW+LMC), the analysis reveals four key results:

**1. Velocity Structure Changes:**
The vₓ-vᵧ scatter plots show that the velocity distribution extends to higher speeds once the LMC is present. In the isolated MW (snapshot 105), the distribution is more concentrated, with relatively few particles exceeding |v| > 400 km/s. In the present-day snapshot with the LMC (snapshot 153), the region |v| > 400 km/s is populated much more densely. This is consistent with the high-speed tail contribution from LMC-origin particles reported by Smith-Orlik et al., where these particles dominate the tail of the velocity distribution despite making up only a small percentage of all particles. The scatter plots alone do not separate LMC-origin from native MW particles, so this comparison remains qualitative.

**2. Radial Redistribution of Subhalos:**
The subhalo radial distribution profiles N_sub(r) show significant changes in satellite galaxy populations between r = 50-150 kpc. This radial range corresponds to the region most affected by the LMC's gravitational influence during its pericenter passage (~50 kpc). Comparing the isolated MW to the present-day configuration reveals differences in both the number and spatial arrangement of subhalos in this zone. This suggests the LMC's presence either tidally disrupts satellites in this region or alters their orbital dynamics, redistributing them to different radii.

**3. Dark Matter Density Perturbations:**
Comparison of dark matter mass density profiles between snapshot 105 and snapshot 153 reveals specific radial ranges where the LMC has enhanced or depleted the dark matter distribution. Computing the density ratio ρ_mass,153/ρ_mass,105 quantifies the fractional change at each radius, identifying zones where the LMC's gravitational perturbation is strongest. Regions showing ratios significantly different from 1.0 indicate where the LMC has redistributed dark matter mass, either through direct contribution of LMC particles or through gravitational effects on native MW particles. These perturbations are critical for understanding how the LMC affects the local dark matter environment in the Solar neighborhood (r ~ 8 kpc), which lies just inside the 10 kpc inner edge of the current radial bins.

**4. Mass Density vs Number Density: Physical Significance**
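The comparison between number density [particles/kpc³] and mass density [M_☉/kpc³] profiles reveals an important distinction. Number density tracks how particles are spatially redistributed, showing where particle concentrations increase or decrease. Mass density, however, is the gravitationally significant quantity: the actual mass distribution that determines dynamics and can be compared to observational constraints on the Milky Way's mass profile. Because every dark matter particle is assigned the same mass (m_dm = 3×10⁵ M_☉), the dark matter mass density is the number density scaled by m_dm and the two profiles share the same shape. The distinction matters most for subhalos, whose individual masses differ.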




# 4: Now what? (What's next?)
## Plan for the next week

**1. Expanded Snapshot Analysis:**

  - Analyze pre-infall snapshots (80-100) for robust isolated MW baseline
  - Examine pericenter snapshot (~139) where effects should be maximal
  - Track post-pericenter evolution (140-153) to measure relaxation timescales
  - Generate time-dependent profiles by interpolating between snapshots

**2. Halo Integral Implementation:**
Generate η(v_min) curves for isolated MW, MW at pericenter, and MW+LMC to reproduce Smith-Orlik Figure 2.
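
As a starting point for this step, the halo integral η(v_min) = ∫ f(v)/v d³v over speeds v ≥ v_min can be estimated from simulation particles as the average of 1/v over the particles above the threshold, normalized by the total particle count. The sketch below is a hedged outline, not the method of Smith-Orlik et al.: `halo_integral` and `v_obs` are hypothetical names, the input is assumed to be already restricted to the Solar-region particles, and `v_obs` is the observer velocity to subtract so that speeds are in the detector frame (zero by default, which leaves the velocities in the galactocentric frame).

```python
import numpy as np

def halo_integral(vel, v_min, v_obs=(0.0, 0.0, 0.0)):
    """Particle estimate of eta(v_min) in (km/s)^-1.

    vel   : (N, 3) velocities in km/s
    v_min : scalar or array of minimum speeds in km/s
    v_obs : observer velocity subtracted from every particle (assumption)
    Assumes no particle has exactly zero speed after the shift.
    """
    vel = np.asarray(vel, dtype=float) - np.asarray(v_obs, dtype=float)
    speed = np.sort(np.linalg.norm(vel, axis=1))
    n = speed.size
    # tail[k] = sum of 1/speed[j] for j >= k, with a trailing 0 for "no particles"
    tail = np.concatenate([np.cumsum((1.0 / speed)[::-1])[::-1], [0.0]])
    # Index of the first particle with speed >= v_min
    idx = np.searchsorted(speed, np.atleast_1d(v_min), side="left")
    return tail[idx] / n

# Three hand-picked particles with speeds 100, 200 and 400 km/s
vel_demo = np.array([[100.0, 0.0, 0.0], [0.0, 200.0, 0.0], [0.0, 0.0, 400.0]])
eta_demo = halo_integral(vel_demo, v_min=[0.0, 150.0, 500.0])
```

Sorting once and reusing the cumulative tail sums makes it cheap to evaluate η on a dense grid of v_min values, which is what a curve comparison across snapshots would need.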

# 5: Bibliography

[1] Adam Smith-Orlik et al., "The Impact of the Large Magellanic Cloud on Dark Matter Direct Detection Signals," Journal of Cosmology and Astroparticle Physics (2024)