OpenNucleome for high-resolution nuclear structural and dynamical modeling

The intricate structural organization of the human nucleus is fundamental to cellular function and gene regulation. Recent advancements in experimental techniques, including high-throughput sequencing and microscopy, have provided valuable insights into nuclear organization. Computational modeling has played significant roles in interpreting experimental observations by reconstructing high-resolution structural ensembles and uncovering organization principles. However, the absence of standardized modeling tools poses challenges for furthering nuclear investigations. We present OpenNucleome—an open-source software designed for conducting GPU-accelerated molecular dynamics simulations of the human nucleus. OpenNucleome offers particle-based representations of chromosomes at a resolution of 100 KB, encompassing nuclear lamina, nucleoli, and speckles. This software furnishes highly accurate structural models of nuclear architecture, affording the means for dynamic simulations of condensate formation, fusion, and exploration of non-equilibrium effects. We applied OpenNucleome to uncover the mechanisms driving the emergence of ‘fixed points’ within the nucleus—signifying genomic loci robustly anchored in proximity to specific nuclear bodies for functional purposes. This anchoring remains resilient even amidst significant fluctuations in chromosome radial positions and nuclear shapes within individual cells. Our findings lend support to a nuclear zoning model that elucidates genome functionality. We anticipate OpenNucleome to serve as a valuable tool for nuclear investigations, streamlining mechanistic explorations and enhancing the interpretation of experimental observations.


Introduction
The highly complex structural organization of the human nucleus plays a crucial role in the functioning and regulation of our cells (Dekker et al., 2017;Hübner et al., 2013;Bickmore, 2013;Gorkin et al., 2014;Dekker and Mirny, 2016;Furlong and Levine, 2018;Finn and Misteli, 2019;Chen and Belmont, 2019;Lin et al., 2021;Liu et al., 2024).The complexity arises from the diverse range of nuclear landmarks, such as nucleoli (Lafontaine et al., 2021), nuclear speckles (Chen and Belmont, 2019;Lamond and Spector, 2003), and the nuclear lamina (van Steensel and Belmont, 2017), each serving distinct functions.These landmarks provide specialized environments for various nuclear processes, allowing for efficient coordination and regulation of gene expression.Moreover, the spatial arrangement of chromosomes within the nucleus, intertwined with the nuclear landmarks, is critical for proper gene regulation and communication between different genome regions.Disruptions or abnormalities in the nuclear organization can have profound consequences on cellular function and can contribute to the development of diseases, including cancer and genetic disorders (Seruga et al., 2008;Schuster-Böckler and Lehner, 2012).
To complement these sequencing approaches, microscopic imaging techniques directly probe the spatial positions within individual nuclei (Bickmore, 2013;van Steensel and Belmont, 2017;Chen et al., 2015;Boettiger et al., 2016;Shachar et al., 2015).Recent advancements in DNA FISH (fluorescence in situ hybridization) have enabled high-throughput imaging of thousands of loci simultaneously (Su et al., 2020;Takei et al., 2021).These imaging studies have not only confirmed the structural features observed through sequencing techniques but have also provided valuable insights into the heterogeneity present at the single-cell level.
Despite the progress made in computational modeling, the absence of well-documented software with easy-to-follow tutorials pose a challenge.Many research groups develop their own independent software, which complicates cross-validation and hinders the establishment of best practices for genome modeling (Fujishiro and Sasai, 2022;Yildirim et al., 2023;Oliveira Junior et al., 2021).Moreover, comprehensive models of the entire nucleus, especially at high resolution, remain scarce.Addressing these limitations and fostering collaboration in the scientific community can be achieved through the development of open-source tools.By promoting transparency and accessibility, such tools have the potential to greatly facilitate nuclear modeling and contribute to a more unified and collaborative research environment.
We present OpenNucleome, an open-source software designed for conducting molecular dynamics (MD) simulations of the human nucleus.This software streamlines the process of setting up whole nucleus simulations through just a few lines of Python scripting.OpenNucleome can unveil intricate, high-resolution structural and dynamic chromosome arrangements at a 100KB resolution.It empowers researchers to track the kinetics of condensate formation and fusion while also exploring the influence of chemical modifications on condensate stability.Furthermore, it facilitates the examination of nuclear envelope deformation's impact on genome organization.The software's modular architecture enhances its adaptability and extensibility.Leveraging the power of OpenMM (Eastman et al., 2017), a GPU-accelerated MD engine, OpenNucleome ensures efficient simulations.
Our work demonstrates the fidelity of the simulated nuclear organizations by faithfully reproducing Hi-C, Lamin B DamID, TSA-Seq, and DNA-MERFISH data.The dynamic insights extracted from this model are pivotal in advancing our understanding of nuclear organization mechanisms.Our findings reveal that inherent heterogeneity in chromosome contacts naturally emerges within single cells.Interestingly, robust contacts between chromosomes and nuclear bodies can also be established due to a coupled self-assembly mechanism.Notably, the resilience of contacts involving nuclear bodies supports a nuclear zoning model for genome function.In the realm of nuclear investigations, we anticipate OpenNucleome to serve as an invaluable tool, seamlessly complementing experimental techniques.

Non-equilibrium nucleus model at 100 KB resolution
We present an open-source implementation of a computational framework that facilitates the structural and dynamical characterization of the human nucleus.This framework builds upon a previous investigation but incorporates several significant modifications.Firstly, we enhance the model resolution by a factor of 10, enabling the precise determination of the spatial positioning of each chromatin segment measuring 100KB in length.Secondly, we present a kinetic scheme for speckles that accounts for the phosphorylation of protein molecules.This inclusion captures the influence of chemical reactions on the stability and dynamics of nuclear bodies.Thirdly, we incorporate explicit nuclear envelope dynamics to explore the impact of large-scale deformations on genome organization.Finally, our implementation into OpenMM offers the advantages of Python Scripting and GPU acceleration, facilitating easy extension and customization.These features will facilitate the broad applicability and adoption of the proposed model.
The nucleus model provides particle-based representations for chromosomes, nucleoli, speckles, and the nuclear envelope.As shown in Figure 1A and B, each of the 46 chromosomes is represented as a beads-on-a-string polymer, where each bead represents a 100-KB-long genomic segment.Based on Hi-C data, we further assign each bead as compartment A, B, or C to signify euchromatin, heterochromatin, or pericentromeric regions.The lamina was modeled as a spherical enclosure with 10 µm diameter, using discrete particles arranged to represent a mesh grid with covalent bonds linking together nearest neighbors (Strom et al., 2021).We modeled nucleoli and speckles as liquid droplets that emerge through the spontaneous phase separation of coarse-grained particles, representing protein and RNA molecule aggregates (Chen and Belmont, 2019;Lafontaine et al., 2021).These particles exhibited attractive interactions within the same type to promote condensation.More details about the various components of the system can be found in the Appendix 1, section 'Components of the whole nucleus model'.
The energy function of the nucleus model includes three components that account for the selfassembly of chromosomes, the assembly of nuclear bodies, and the coupling between chromosomes and nuclear landmarks.Therefore, the model approximates nuclear organization as a coupled selfassembly process.The chromosome energy function (see Equation 7 in Appendix 1, section 'Hi-C inspired interactions for the diploid human genome') includes terms that account for the polymer connectivity and excluded volume effect, an ideal potential, compartment-specific interactions, and specific interchromosomal interactions.As shown in Figure 1C, the ideal potential is only applied for beads from the same chromosome to approximate the effect of loop extrusion by Cohesin molecules (Sanborn et al., 2015;Fudenberg et al., 2016) for chromosome compaction and territory formation (Di Pierro et al., 2016;Zhang and Wolynes, 2017).Compartment-specific interactions, on the other hand, promote microphase separation and compartmentalization of euchromatin and heterochromatin.Finally, interchromosomal interactions account for sequence-specific effects that compartmentdependent potentials cannot capture.
Interactions among coarse-grained particles that form nuclear bodies were designed to promote and stabilize the formation of liquid droplets, as has been revealed by many experiments (Handwerger et al., 2005;Caragine et al., 2018;Caragine et al., 2019).We adopted the Lennard-Jones potential for nucleolar particles to mimic the weak, multivalent interactions that arise from protein and RNA molecules that make up the nucleoli.As a first attempt to approximate their complex dynamics, we considered two types of particles that form speckles: phosphorylated (P) and de-phosphorylated (dP).The two types can interconvert via chemical reactions (Brackley et al., 2017;Söding et al., 2020;Carrero et al., 2006) and dP particles share attractive interactions modeled with the Lennard-Jones potential.
As shown in Figure 1C, to recognize specific interactions between chromosomes and nuclear landmarks, we introduced contact potentials between them.These potentials are inspired by the experimental techniques that probe the corresponding contacts.Appendix 1, sections 'Chromosome-nuclear landmark interactions' and 'Nuclear landmark-nuclear landmark interactions' contain more details about all the nuclear landmark-related energy functions.

Optimization of model parameters with experimental data
The nucleus model was designed to be interpretable such that energy terms represent physical processes.Furthermore, the expressions of the interaction potentials were also designed such that their parameters can be determined from experimental data via the maximum entropy optimization algorithm (Lin et al., 2021;Xie and Zhang, 2019;Schuette et al., 2023).Below, we briefly outline the procedure used for parameter optimization and further details can be found in Appendix 1, section 'Optimization of the whole nucleus model parameters.
As illustrated in Figure 2, starting from a given set of parameters, we first perform MD simulations to produce a collection of 3D structures for the diploid genome and various nuclear bodies.These structures are then transformed into a contact map or contact probabilities between chromatin beads and nuclear landmarks by averaging over homologous chromosomes.Constraints corresponding to different energy terms could be obtained from the simulated results and compared with those  estimated from Hi-C, SON TSA-Seq, and Lamin B DamID profiles.Finally, the model parameters were updated based on the difference between simulated and experimental constraints using the adaptive moment estimation (Adam) optimization algorithm (Kingma and Ba, 2014).The three steps can be repeated with updated parameters to improve the simulation-experiment agreement further.
No quantitative experimental data exists for interactions among nuclear body particles to serve as constraints.We varied the strength of the interaction potential to produce 2-3 nucleoli and ∼30 speckle clusters during the simulations (Figure 2-figure supplement 1) while ensuring the fluidity of the resulting droplets.

Molecular dynamics simulations with GPU acceleration
We implemented the nucleus model into the MD engine OpenMM (Eastman et al., 2017).OpenMM offers an excellent interface with Python scripting, significantly improving the readability and customizability of the model.The code was designed into functional modules, with different components, such as chromosomes and nuclear landmarks, written as separate classes.This design further facilitates the introduction of additional nuclear components, if desired, with minimal changes to existing code.We provide examples of simulation set up, trajectory analysis, parameter optimization, and introducing new features in the GitHub repository.
Figure 3A illustrates the workflow for setting up and executing whole nucleus simulations.A configuration file that provides the position of individual particles in the PDB file format is needed to initialize the simulations.This file also contains topological information regarding whether a particle represents chromosomes or nuclear landmarks and the identity of specific chromosomes.The input file can be generated with provided Python scripts by randomly distributing the positions of chromosomes, speckles, and nucleoli, though optimized configurations are also included in the GitHub repository.By default, the lamina particles will be uniformly placed on a sphere of 10 μm in diameter.Upon parsing the configuration file, interactions among various components can be set up with optimized parameters.This step will produce an object that can be used for MD simulations.As shown in Figure 3B, the workflow only requires a few lines of code.The package also includes analysis scripts to compute contact maps, monitor conformational dynamics, and track nuclear bodies.A significant benefit of OpenMM is its native support of GPU acceleration.As shown in Figure 3C, the simulation speed with one Nvidia Volta V100 GPU is 150 times faster than that of the four Intel Xeon Platinum 8260 CPU cores.Notably, this performance enhancement cannot be achieved by simply increasing the CPU core numbers.For example, the simulation speed with 32 CPU cores is less than twice that of 4 CPU cores, potentially due to the system's heterogeneous distribution of particles.

Simulations reproduce and predict diverse experimental data
We extensively validated the parameterized nucleus model to examine its biological relevance.MD simulations initialized from 50 different initial configurations were performed to build an ensemble of structures.As mentioned in the following section, a diverse set of initial configurations is essential for reproducing interchromosomal contacts probed in Hi-C.From the simulated structures, we computed various quantities for direct comparison with experimental measurements.Given that the majority of experimental data were analyzed for the haploid genome, we adopted a similar approach by averaging over paternal and maternal chromosomes to facilitate direct comparison.More details on data analysis can be found in Appendix 1, section 'Details of simulation data analysis'.
We compared the simulated contact probabilities among chromosomes with Hi-C data.As shown in Figure 4A and Figure 4-figure supplement 1, the simulated and experimental contact maps are highly correlated.The squares along the diagonal support the formation of chromosome territories that promote intrachromosomal contacts, and the apparent checkboard patterns follow the compartmentalization of various chromatin types.We further examined the decay of intrachromosomal contacts as a function of the sequence separation, which is known to deviate from that of an equilibrium globule (Lieberman-Aiden et al., 2009).As shown in Figure 4B, the simulated results overlap well with the Hi-C data (orange curve).In addition, the simulated average contact probabilities between various compartment types match values estimated from Hi-C data.Moreover, the simulated and experimental average contact probabilities between pairs of chromosomes agree well, and the Pearson correlation coefficient between the two datasets reaches 0.89.
We further examined the contacts between chromosomes and nuclear landmarks.As illustrated in Figure 4C, the simulated Lamin-B DamID signals for chromosome 7 match well with the experimental results, capturing the complex contact pattern that weaves chromatin toward and away from the nuclear envelope.Similarly, SON TSA-Seq data that quantify the contact between chromosomes and speckles are well captured by simulated structures.The anti-correlation between DamID and TSA-Seq is clearly visible.The observed agreement between simulation and experimental results is not limited to any particular chromosome.Good agreements are achieved for all chromosomes.
The simulations also provide 3D representations of the nucleus that can be compared with DNA-MERFISH data (Su et al., 2020).We found that the simulated radius of gyration of individual chromosomes matches well with experimental values (Figure 4A).The simulated and experimental average normalized chromosome radial positions also correlate strongly, as shown in Figure 5B.We note that while the sequencing results presented in Figure 4 were used for model parameterization, the MERFISH data were not.Therefore, the simulation results here are de novo predictions, and their agreement with experimental data strongly supports the coupled assembly mechanism used for designing the energy function.
A significant advantage of MD simulation-based models is the dynamical information they naturally produce.We measured the dynamics of telomeres by tracking the mean-square displacements (MSDs), ⟨r 2 (∆t)⟩ , as a function of time.In Figure 5C, we plot representative MSD trajectories over a 1-hr timescale.In line with previous research (Di Pierro et al., 2018;Bronstein et al., 2009;Lee et al., 2021), telomeres display anomalous subdiffusive motion.When fitted with the equation ⟨r 2 ( ∆t ) ⟩ = Dα∆t α , these trajectories yield a spectrum of α values, with a peak around 0.59.

The exponent and the diffusion coefficient
both match well with the experimental values (Bronshtein et al., 2015;Jack et al., 2022), upon setting the nucleoplasmic viscosity as 1Pa • s (see Appendix 1, section 'Mapping the reduced time unit to real time' for more details).
The good agreement in the dynamics of individual loci further inspired us to examine the diffusion of whole chromosomes.In particular, we plotted the normalized chromosome radial positions as a function of time in Figure 6A.Remarkably, we found that chromosomes appear arrested and no significant changes in their positions are observed over timescales comparable to the cell cycle (see

Heterogeneity and robustness of the simulated conformational ensemble
The lack of relaxation of chromosome radial positions suggests the importance of starting configurations used to initialize the simulations.Statistical averages of the resulting ensemble of nuclear structures depend crucially on these starting configurations.Using an optimization procedure, we selected them from 1000 configurations to maximize the agreement with experimental lamin-B DamID and interchromosomal contact probabilities.Appendix 1, section 'Initial configurations for simulations' provides more details on preparing the 1000 initial configurations.
We selected a total of 50 starting configurations to initiate independent simulations.Smaller sets of starting configurations are not sufficient to reproduce the interchromosomal contact probabilities, as shown in Figure 6-figure supplement 2B.Notably, different sets of 50 configurations selected from independent trials show significant overlap (Figure 6-figure supplement 2D), supporting the robustness of the selection protocol in detecting conserved features of genome organization.
While the ensemble as a whole is relatively robust, individual configurations with the ensemble exhibit significant differences.For example, the Lamin B DamID profiles produced from different trajectories are only weakly correlated (Figure 6C), with an average correlation coefficient of 0.53.These weak correlations result from significant differences in the normalized radial positions of chromosomes, as can be seen in representative configurations from two simulation trajectories (Figure 6B).The fluctuations of normalized radial positions cause changes in contacts between chromosomes as well, resulting in little correlation between interchromosomal contact matrices (Figure 6D).
We examined genome organizations reported by Su et al. and found a similar variation of interchromosomal contact probabilities across individual cells (Figure 6-figure supplement 2A and D).Notably, the simulated configurations capture the fluctuations of interchromosomal contacts observed in DNA-MERFISH data, further supporting the biological relevance of the reported in silico structures.
Despite the differences in interchromosomal contacts across trajectories, high conservation of connections between chromosomes and speckles can be observed in individual simulations.For example, the average correlation coefficient between in silico SON TSA-Seq profiles produced from different trajectories is 0.72, much higher than the corresponding value for Lamin B DamID profiles.Conservation of contacts between chromosomes and nuclear bodies (zones) across individual cells has indeed been reported in a previous study that simultaneously images chromatin and various subnuclear structures (Takei et al., 2021).

Nuclear deformation preserves chromosome-nuclear body contacts
Numerous studies have highlighted the remarkable influence of nuclear shape on the positioning of chromosomes and the regulation of gene expression (Brahmachari et al., 2022;Contessoto et al., 2023).The nucleus, once regarded as a mere compartment for DNA storage, is increasingly recognized as a dynamic and intricately structured organelle.To better understand the interplay between nuclear shape and genome organization as a fundamental mechanism that shapes the transcriptional landscape, we performed additional simulations in which the nuclear lamina was altered from a sphere into more ellipsoidal shapes by applying a force along the z-axis (Figure 7A).More details about these simulations can be found in Appendix 1, section 'Nuclear envelope deformation simulations'.
As illustrated in Figure 7B, the presence of external forces resulted in significant alterations in nuclear shape.We conducted two independent simulations with different force strengths, leading   deformation, whereas TSA-Seq signals were much less affected and remained highly correlated with the results from the spherical nucleus simulations.
Therefore, it appears that speckles, and potentially other nuclear condensates, can dynamically reorganize in response to changes in chromosome conformations to maintain contacts with genomic loci.This robustness in nuclear body contacts may be essential for ensuring the robust functioning of the genome in a population of cells with significant variability in nuclear shape.

Discussion
We introduced a computational model, OpenNucleome, to facilitate simulations for the human nucleus at high structural and temporal resolution.We conducted extensive cross-validation with experimental data to support the biological relevance of simulated 3D structures.Implementing the model into the MD package, OpenMM enables GPU acceleration for long-timescale simulations.Tutorials in the format of Python Scripts with extensive documentation are provided to facilitate the adoption of the model by the community.
Our software enhances the capabilities of existing genome simulation tools Fujishiro and Sasai, 2022;Yildirim et al., 2023;Oliveira Junior et al., 2021.Specifically, OpenNucleome aligns with the design principles of Open-MiChroM (Oliveira Junior et al., 2021), prioritizing open-source accessibility while expanding simulation capabilities to the entire nucleus.Similar to software from the Alber lab (Yildirim et al., 2023), OpenNucleome offers high-resolution genome organization that faithfully reproduces a diverse range of experimental data.Furthermore, beyond static structures, OpenNucleome facilitates dynamic simulations with explicit representations of various nuclear condensates, akin to the model developed by Fujishiro and Sasai, 2022.A significant advantage of OpenNucleome lies in its predictive power for dynamical information.For example, the model succeeded in reproducing the subdiffusive behavior of telomeres.We further showed that the dynamics of individual chromosomes are slow and their radial positions do not relax over the time course of a cell cycle.This is consistent with previous theoretical estimations on chromosome dynamics (Rosa and Everaers, 2008) and recent observations of solid behavior of chromatin in vivo (Strickfaden et al., 2020).Live cell experiments that directly track the positions of multiple chromosomes could further validate/falsify this prediction.We anticipate the model will greatly facilitate the investigation of the dynamics of genomic loci and nuclear bodies and the interpreting of live cell imaging results.
Slow chromosome dynamics and a lack of conformational relaxation naturally result in the heterogeneity of chromosome radial positions across individual cells.This heterogeneity raises doubts about the notion that chromosome radial positions provide robust and reliable mechanisms for gene regulation (Hübner et al., 2013;Maeshima et al., 2010;Fraser and Bickmore, 2007;Takizawa et al., 2008).Instead, our results support the nuclear zoning model for gene regulation (Takei et al., 2021), where specific loci function as 'fixed points' anchored to certain nuclear bodies in all cells.This anchoring mechanism robustly creates the desired molecular environment surrounding these genomic segments.Unlike chromosome radial positions, contacts between genomic loci and speckles can be robustly established in individual cells, as shown in our simulations.It was achieved through a nucleation process that attracts speckle particles toward specific loci due to specific interactions.Nucleation occurs much more rapidly than chromosome rearrangement due to the smaller size of speckle particles.The coupled self-assembly mechanism for chromosomes and nuclear bodies can similarly facilitate the formation of other nuclear zones for different kinds of fixed points.
Despite the heterogeneity in chromosome positions and interchromosomal contacts, the ensemble of nuclear structures as a whole is not random and exhibits conserved features.For example, on average, certain chromosomes remain closer to the nuclear envelope than others (see Figure 5B).Similarly, the average contact frequency between certain chromosome pairs is higher than others, though this trend can be frequently violated in individual cells.How such conserved features arise as cells exit from the mitotic phase remains unclear and would be interesting for further explorations.

Molecular dynamics simulation details
We used the software package OpenMM Eastman et al., 2017 to perform MD simulations in reduced units at constant temperature (T = 1.0).Unless otherwise specified, we froze the lamina particles and only propagated the dynamics of chromatin, nucleoli, and speckles.
Two integration schemes were used with a time step of dt = 0.005to efficiently generate structural ensembles and produce realistic dynamical information, respectively.For simulations used in parameter optimization and building structural ensembles, we employed the Langevin integrator with a damping coefficient of γ −1 = 10.0 .In the case of MSD calculations shown in Figure 6, we utilized Brownian dynamics with a damping coefficient of γ −1 = 0.01 .The higher damping coefficient provides a better approximation to the viscous nucleus environment, while the smaller value in the Langevin integrator facilitates conformational sampling with faster diffusion rates.
We employed the semi-grand Monte Carlo technique (Sadigh et al., 2012) to simulate chemical transitions between two types of speckle particles.At every 4000 simulation steps, we attempt a total of N Sp chemical reactions that converts one type of speckle particles to the other type with a probability of 0.2.N Sp corresponds to the total number of speckle particles, and the switching probability was chosen to be comparable to the experimental phosphorylation rate.More details on the speckle dynamics are provided in Appendix 1, section 'Speckles as phase-separated droplets undergoing chemical modifications'.
When deforming the nuclear envelope, we unfroze the lamina particles and evolved them dynamically as the rest of the nucleus.Bonded interactions among lamina particles held the nuclear envelope together as a particle mesh.A harmonic force along the z-axis was introduced to compress the particle mesh.More details are provided in Appendix 1, section 'Nuclear envelope deformation simulations'.
For simulations used to optimize parameters, a total of 50 independent 3-million-step-long trajectories were performed.Configurations were recorded at every 2000 simulation steps for analysis.The first 500,000 steps of each trajectory were discarded as equilibration.For production simulations, we performed 50 independent 10-million-step long trajectories starting from different initial configurations.Nuclear structures were again recorded at every 2000 steps to determine statistical averages presented in the article.An additional eight simulations of 30million steps in length were performed to compute telomere MSDs.
We mapped the reduced units to real units with the conversion of length scale σ = 385nm and the timescale in Brownian dynamics simulations τ = 0.65s .These conversions were determined as detailed in Appendix 1, section 'Unit conversion'.

Experimental data processing and analysis
We obtained the in situ Hi-C data, SON TSA-seq data, and Lamin-B DamID data of HFF cell lines from the 4DN data portal.The intra and interchromosomal interactions were calculated at 100KB resolution with VC_SQRT normalization applied to the interaction matrices.Hi-C data extraction and normalization were performed using Juicer tools (Durand et al., 2016).We followed the same processing and normalization method described in Zhang et al., 2021 to analyze TSA-seq data.Two biological replicates of Lamin-B DamID data were merged and the normalized counts over Dam-only control were used for analysis.The SON TSA-Seq and Lamin-B DamID data were processed at the 25KB resolution and the average values at the 100KB resolution were used in Figure 4 for model validation.

Funder Grant reference number Author
National Institute of General Medical Sciences

R35GM133580 Bin Zhang
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.(3)

Data availability
For simplicity, we assume the forward transition rate from Sp-P to Sp-dP particles is identical to the reverse rate.Because of the symmetry in transition rates, the average number of dP particles ⟨ N Sp-dP ⟩ = 0.5N Sp , where N Sp is the total number of speckle particles.We chose the transition probability as 0.2 to be consistent with the phosphorylation rate.In particular, we estimate the rate as where τ is the time interval between consecutive attempts of chemical reactions.As detailed in section 'Molecular dynamics simulation details', the reactions were attempted every 4000 simulation steps, with a time step of 0.005.The time unit in our simulations is 0.65 s (see section 'Mapping the reduced time unit to real time').The estimated value for k 12 is in the same order as the experimental phosphorylation rate (Velazquez-Dones et al., 2005).
We estimated the total number of speckle particles as follows.Assuming that there is a total of 30 speckles (Galganski et al., 2017), we have N Sp-dP = 30 × Ns , where Ns is the number of Sp-dP particles in each cluster.This estimation assumes that only Sp-dP particles share attractive interactions and contribute to cluster formation.From the experimentally estimated relative mass densities of the protein concentrations in the speckle and nucleolus droplet as 170 203 (Handwerger et al., 2005) (5) We assumed that speckle and nucleolus particles have identical mass and each nucleolus has 100 particles.The radius for speckle and nucleolus was approximated as 0.3 and 0.5µm , yielding Ns ≈ 20 and N Sp-dP ≈ 600 .Because of the kinetic scheme defined in Equation 3, only parts of Sp-dP particles will participate in droplet formation during the simulations.Therefore, we increase the particle number and set ⟨ N Sp-dP ⟩ = 800 , which yields N Sp = 1600 .

Energy function of the whole nucleus model
As detailed below, the energy function of the whole nucleus, U Nucleus , consists of interactions among chromosomes, among nuclear landmarks, and cross interactions between the two.Therefore,

Hi-C inspired interactions for the diploid human genome
The energy function of the genome model is defined as determines a generic polymeric topology of chromosomes with excluded volume effect: where the subscripts i, i + 1 , and i+2 represent the index of denote the bonding and angular potential applied for neighboring beads to ensure the connectivity of the chromatin chain and follow: where, as discussed in Equation 32, r 0 = 0.5σ represents the size of the chromatin bead.The softcore potential provides excluded volume effects for pairs of beads from the same or different chromosomes and follows: where usc ( r ij ) denotes a soft-core potential added to each pair formed by beads index i and j to account for the excluded volume effect while allowing the finite probability of cross-over of polymer chains.
which corresponds to the Lennard-Jones potential capped off at a finite volume within a repulsive core to allow for chain crossing at a finite energy cost.Ecut = 4ϵ and rcut is chosen as the distance at which is the intra-chromosomal potential applied to genomic loci within the same chromosome, while Ucompt ( r ) is the compartment-specific interaction potential.The ideal potential, which can be rigorously derived following the maximum entropy principle (Roux and Weare, 2013;Zhang and Wolynes, 2015), adopts the following form: where I indexes over each chromosome and i and j index over pair of beads on that chromosome.α ideal (|i − j|) depends only on the sequence separation between two beads i and j. f(r ij ) measures the probability of contact formation for two loci separated by a distance of r ij , and its ensemble average corresponds to the contact probability measured in Hi-C experiments.f(r ij ) adopts the form The numerical value of r c was determined from the Hi-C contact map, as detailed in the next section.This contact probability function depicts that when r < rc, f ≈ 1 but when r > rc, f ≈ ( rc/r ) 4 .
The power-law decay with an exponent of 4 is consistent with the relationship between contact probability and spatial distances revealed in imaging studies (Qi and Zhang, 2019;Wang et al., 2017).
The tanh function ensures the continuity of the function and its derivative around r c (Appendix 1figure 1).Additionally, we truncated the ideal potential to be applicable for a sequence separation less than or equal to 100 MB and set the parameters for larger sequence separations to be zero.As shown in Figure 4 of the main text, our parameterized ideal potential produced chromosomes with sizes comparable to imaging results.Incorporating longer-range interactions to improve the model further is straightforward but would also significantly increase the number of parameters.Similar to the ideal potential discussed above, we have where T i and T j denote the compartment types of beads i and j which can be A, B, or C. Therefore, CG beads of the same compartment types will share the same interaction parameter αcompt ( T i , T j ) , which will be derived from average Hi-C contact frequencies as detailed in the following sections.
To account for specific interactions between chromosomes, we introduced the interchromosomal potential as I, J ∈ { 1, 2, . . ., 23 } index the haploid chromosomes, and parental and maternal chromosomes share identical parameters.This potential allows the model to capture interactions beyond those arising purely from compartmentalization as defined in Equation 14.
All parameters in the energy function are summarized in Appendix 1-table 1.The procedure used for parameter optimization is detailed in the following sections.
with ro = 0.5σ .i indices all the lamina particles, and j represents the nearest four neighbors around i determined from the initial configuration for which the particles were placed on a Fibonacci grid.To avoid pairs (i, j) being counted twice or more, we set j always larger than i.
Short-ranged, non-bonded interactions were introduced among nuclear landmark particles to account for attractions that promote phase separation and the excluded volume effect.These interactions were modeled with a cut and shifted Lennard-Jones (LJ) potential defined as with Ecut = 4ϵ ) .We note that when rcut was set as σ LJ × 2 1/6 , the potential has Optimization of the whole nucleus model parameters Below, we describe the procedures used to derive model parameters.

Connecting imaging and Hi-C data with the contact function
The function f(r) defined in Equation 13 was used to determine the chromatin contact probabilities.The availability of spatial positions and Hi-C data makes possible the definition of a contact function, f(r), that converts distances into contact probabilities.In particular, we determined r c as the value at which the simulated average interchromosomal contact probability ⟨ f ( rc )⟩ sim inter matches the experimental value, that is, The angular brackets represent ensemble averaging, performed using the structures at 100 KB resolution reported in our previous work (Kamat et al., 2023).Matching simulation and experimental values produced rc = 0.54σ ≈ 208 nm.We note that this estimation for r c is comparable to the average bond length (0.5 σ), thus ensuring that nearest neighbor genomic regions with contact probability close to 1, that is, ⟨f

Adam optimizer for chromosome interaction parameters
Mathematical expressions for the various energy terms in U Genome were designed such that their ensemble averages can be mapped onto combinations of contact frequencies measured in Hi-C.The correspondence between the energy functions and Hi-C measurements allows model parameterization with an efficient adaptive moment (Adam) algorithm (Kingma and Ba, 2014).
, and α inter ( I, J ) were tuned to satisfy the following constraints: where δ Ti,T1 is the Kronecker delta function with the following definition: The angular bracket represents the ensemble average, and f exp ij is the corresponding experimental contact frequency.
During the optimization process, our aim was to minimize the disparity between experimental findings and simulated data.To achieve this, we defined the cost function as follows: where the index i iterates over all the constraints defined in Equation 28.
The details of the algorithm for parameter optimization are as follows: 1. Starting with a set of values for , and α inter ( I, J ) , we performed 50 independent 3-million-step long MD simulations to obtain an ensemble of nuclear configurations.The 500K steps of each trajectory are discarded as equilibration.We collected the configurations at every 2000 simulation steps from the rest of the simulation trajectories to compute the ensemble averages defined on the left-hand side of Equationi 13.
2. Check the convergence of the optimization by calculating the percentage of error defined as The summation over i includes all the average contact probabilities defined in Equation 28. 3.If the error is less than a tolerance value e tol , the optimization has converged, and we stop the simulations.Otherwise, we update the parameters, α, using the Adam optimizer (Kingma and Ba, 2014).With the new parameter values, we return to step one and restart the iteration.

Adam optimizer for chromosome-nuclear body interaction parameters
Similar to those among chromatin particles, the interaction parameters between chromatin and nuclear landmarks were optimized with Adam's algorithm to reproduce experimental constraints.
The constraints that we aimed to reproduce were defined as follows: where C La i and C Sp i measure the contacts between chromatin bead i and nuclear lamina and speckles, respectively, as defined in Equations 42 and 45.LAF i and LAF i denote the lamina and speckle association frequency for chromatin bead i as measured in Lamin B DamID and SON TSA-Seq experiments.N denotes the number of chromatin beads.We combined the constraints defined in Equation 31 with those in Equation 28 to simultaneously optimize the parameters using the iterative algorithm outlined in the previous section.We note that the interaction potential between chromatin and speckles defined in Equation 25 did not use precisely the same function as in C Sp i .We chose to sum over all speckle dP particles, rather than identifying the droplets, which is difficult to do during the simulations.

Parameter optimization for nuclear body-nuclear body interactions
As much remains to be known about the organization of nuclear bodies, we designed the interaction potentials and parameters based on qualitative observations without extensive fine-tuning.For example, we used the standard Lennard-Jones potential (Equation 18) to mimic short-range interactions.The lengthscales, σ LJ , in these potentials, were chosen based on a linear combination of the size of interacting particles, as discussed in section 'Unit conversion'.
The interaction strength, ϵ LJ , was set as 1.0 to be on the same order as thermal energy ( k B T ), when the potential was used to account for the excluded volume effect.
For attractive interactions that promote phase separation and nuclear body formation, we set ϵ LJ = 3.0 .Smaller values failed to produce clustered nucleoli, while much larger values significantly decreased the fluidity of the resulting droplets.The same value was used for speckle dP particles and produced droplet numbers comparable to experimental observations (Figure 2-figure supplement 1).

Unit conversion
The reduced unit for length scale is noted as σ.We set the nucleus radii as 13σ.Assuming a nucleus with an average size of 5 μm, we have σ = 385 nm.

Mapping chromatin bead size to real unit
We estimated the size of the chromosome bead as 192.5 nm based on super-resolution imaging data as follows.The median radius of gyration has been shown to follow a power-law scaling as a function of domain length with an exponent of 0.3 (Boettiger et al., 2016).Assuming that the radius of a domain is proportional to the radius of gyration, we have We previously estimated the size of 1 MB bead as R 1MB = σ = 385 nm, and Equation 32 yields the size of 100 KB as R 100KB = 0.5σ .
Setting the nucleoplasmic viscosity as 1Pa • s produces τ B ≈ 0.65s .This mapping produced diffusion coefficients and MSD curves that match well with experimental measurements presented in Bronshtein et al., 2015, as discussed in the main text.We note that the chosen value for the nucleoplasmic viscosity indeed falls into the range of reported experimental values from 10 −1 Pa • s to 10 2 Pa • s (Platani et al., 2002;Tseng et al., 2004).

Molecular dynamics simulation details Initial configurations for simulations
Due to the slow relaxation dynamics of whole chromosomes relative to the simulation timescale, the reported results are sensitive to the configurations used to initialize the simulations.Therefore, we designed the following protocol to prepare the initial configurations and ensure the biological relevance of simulation results.
We first created a total of 1000 configurations for the genome by sequentially generating the conformation of each one of the 46 chromosomes as follows.For a given chromosome, we by placing the first bead at the center (origin) of the nucleus.The positions of the following beads, i, were determined from the ( i − 1 ) -th bead as r i = r i−1 + 0.5v .v is a normalized random vector, and 0.5 was selected as the bond length between neighboring beads.To produce globular chromosome conformations, we rejected vectors, v, that led to bead positions with distance from the center larger than 4σ .Upon creating the conformation of a chromosome i, we shift its center of mass to a value r i com determined as follows.We first compute a mean radial distance, r i o with the following equation: where D i is the average value of Lamin B DamID profile for chromosome i.D hi and D lo represent the highest and lowest average DamID values of all chromosomes, and 6σ and 2σ represent the upper and lower bound in radial positions for chromosomes.As shown in Appendix 1-figure 2, the average Lamin B DamID profiles are highly correlated with normalized chromosome radial positions as reported by DNA MERFISH (Su et al., 2020), supporting their use as a proxy for estimating normalized chromosome radial positions.We then select r i com as a uniformly distributed random variable within the range ] .Without loss of generality, we randomly chose the directions for shifting all 46 chromosomes.We further relaxed the 1000 configurations to build more realistic genome structures.Following an energy minimization process, 1-million-step MD simulations were performed starting from each configuration.Simulations were performed with the following energy function: where U Genome is defined as in Equation 7. U G-La is the excluded volume potential between chromosomes and lamina, that is, only the second term in Equation 24.Parameters in U Genome were from a preliminary optimization.The end configurations of the MD simulations were collected to build the final configuration ensemble (FCE).We further computed the Pearson correlation coefficient of pairwise interchromosomal contacts between different structures in FCE (see section 'Computing pairwise interchromosomal contact probabilities').As shown in Figure 6-figure supplement 2A, the probability distribution of these correlation coefficients is comparable with that determined from DNA-MERFISH structures, supporting the biological relevance of the structural diversity in the constructed ensemble.
From 1000 relaxed configurations, we selected a subset of structures to initialize simulations presented in the main text.An optimization procedure was introduced for structure selection.We start this procedure by randomly select N structures to build the initial configuration ensemble (ICE).We then iteratively go through every configuration in ICE and replace with a structure from FCE that's not already included in ICE.We then compute the Pearson correlation coefficient between new average ICE interchromosomal contact probabilities and experimental values.If the Pearson correlation coefficient is higher than the value determined from the original ICE, the new structure is accepted and the ICE is updated.Otherwise, the new structure is rejected.We stop the selection process for when the Pearson correlation coefficient stops improving.
We found that as N increases, the agreement between ICE interchromosomal contact probabilities and experimental values continue to increase (Figure 6-figure supplement 2B).We set N = 50, which produces a Pearson correlation coefficient between ICE and experimental interchromosomal contact probabilities of 0.9.Further increasing N does not significantly improve the agreement but incurs more computational cost.
It is worth noting that the outcomes of the selection procedure depend on the initial set of configurations included in ICE at the beginning.However, we found that the ICEs produced from 20 independent trials are highly correlated (Figure 6-figure supplement 2C) and all reproduce the heterogeneity in interchromosomal contacts seen in DNA MERFISH data (Figure 6-figure supplement 2D).Therefore, the selection procedure is robust and can produce biologically meaningful configurations to initialize simulations.
With the chromosome positions prepared, we randomly placed 300 nucleoli and 1600 speckle particles inside the nucleus to complete the set up of initial configurations.

Langevin dynamics simulations
We used the Langevin integrator with the damping coefficient γ −1 = 10 to control the temperature at T = 1.0 for simulations used for parameter optimization and for producing an ensemble of nucleus structures.Langevin dynamics simulations allow faster chromosome movements, compared to Brownian dynamics simulations, facilitating the conformational sampling.In these simulations, the lamina particles were frozen and no explicit dynamics were considered for the nuclear envelope.

Brownian dynamics simulations
We also performed Brownian dynamics simulations with damping coefficient γ −1 = 10 −2 to control the temperature at T = 1.0.These simulations provide better approximations of the overdamped dynamics of chromatin for direct comparison with live cell imaging studies.As detailed in section 'Unit conversion', upon mapping the coarse-grained timescale to the physical unit, Brownian dynamics simulations produce diffusion coefficients for telomeres comparable to experimental values (see Figure 5).

Nuclear envelope deformation simulations
We performed Langevin dynamics simulations to investigate the impact of nuclear envelope deformation on genome organization.To induce a compressing force along the z-axis, we introduced a harmonic potential in the form of where z i is the z coordinate of the ith lamina bead, and N La represents the total number of lamina beads.The particles in the system evolve under the combined effect of Ucompress and U Nucleus defined in Equation 6.

Details of simulation data analysis
The computer simulations yield 3D coordinates of the diploid genome.However, when comparing directly with experimental data processed for the haploid genome, unless stated otherwise, we computed averages across paternal and maternal chromosomes to ascertain various genome-wide properties as listed below.

Computing simulated contact probabilities
Simulated contact probability maps were computed by averaging over chromosome configurations collected from all trajectories.For a given configuration, the contact probability between two chromatin segments (i and j) was evaluated using the contact function defined in Equation 13.

Computing the Pearson correlation coefficients between experimental and simulated contact maps
We computed the Pearson correlation coefficients (PCCs) between experimental and simulated contact maps in Figure 4A and Figure 4-figure supplement 1 as where x i and y i represent the experimental and simulated contact probabilities, and n is the total number of data points.Only non-redundant data points, that is, half of the pairwise contacts, are used in the PCC calculation.

Computing pairwise interchromosomal contact probabilities
For a given genome structure, we computed the pairwise interchromosomal contacts as follows.
For every pair of chromosomes, we determined their contact probability by averaging all genomic pairs from two chromosomes using Equation 13.We then averaged over all four pairs of diploid chromosomes to compute the haploid average contacts.In total, there are C 2 22 = 231 contact pairs between haploid chromosomes excluding the sex chromosomes.

Distances from nuclear bodies and association frequencies
The contacts of a chromatin bead i with the nuclear lamina were evaluated as with rc = 0.75σ .We average over the ensemble of nuclear configurations and homologs to compute the in silico Lamin B DamID signal as where the angular brackets indicate ensemble averaging.CLa is defined as the genome wide average of For chromatin-speckle contacts, we first identified the speckles formed at any given structure using the density-based spatial clustering algorithm DBSCAN (Ester et al., 1996) as implemented in the scikit library for Python (Pedregosa et al., 2011).For the identified droplets, we computed their center of mass coordinates, → r com and the radius of gyration, R. With the identified clusters, we then determined the distance from the ith chromatin bead to the sth speckle as where || • || represents the L2 norm.We subtract the radius of the speckle cluster in the above equation to determine the distance to the droplet surface.From the list of distances to different speckles, the contact between chromatin bead i and speckles is computed as where we sum over all the N s speckle clusters.A similar expression was used for determining the contacts between chromatin and nucleoli.Finally, we average over the ensemble of nuclear configurations and homologs to compute the in silico SON TSA-Seq signal as where the angular brackets indicate ensemble averaging.CSp is defined as the genome wide average of Computing simulated normalized chromosome radial positions For a given chromosome i, we first determined its center of mass position denoted as C i .Starting from the center of the nucleus, O, we extend the vector v OC to identify the intersection point with the nuclear lamina as P i .The normalized radial position of chromosome i is then defined as

Computing simulated chromosome radii of gyration
The radius of gyration for a chromosome is computed as where rcom and n are the center of mass and the number of beads of the chromosome.i indices over all the chromosome beads and r i correspond to the Cartesian coordinates of bead i. ||.|| represents the L2 norm.
Computing simulated mean-square displacement MSD for telomeres were computed as where ∆t, δt , and Nstep represent the time interval, the time step, and the total number of steps, respectively.The summation over t corresponds to averaging over eight independent trajectories.MSDs telomeres from paternal and maternal chromosomes are separately computed and analyzed.

Details of experimental data analysis
Interchromosomal contacts from DNA MERFISH data We collected the DNA MERFISH data reported in Su et al., 2020 to construct the experimental ensemble of 5455 genome structures.For each structure, we computed the pairwise interchromosomal contacts following the procedure outlined in section 'Computing pairwise interchromosomal contact probabilities'.
To better visualize and analyze interchromosomal contacts, we applied the Uniform Manifold Approximation and Projection (UMAP) technique as implemented in software package umap-learn (McInnes et al., 2018;Moshtagh, 2005), with default parameters to reduce the 231 haploid contacts into two dimensions.All 5455 DNA MERFISH structures were included in this analysis.
The same transformations produced from the UMAP analysis of experimental structures were applied to in silico configurations to produce results shown in

Computing experimental normalized chromosome radial positions
We followed the same procedure outlined in section 'Computing simulated normalized chromosome radial positions' to compute the experimental values.To determine the center of the nucleus using DNA MERFISH data, we used the algorithm, minimum volume enclosing ellipsoid (MVEE) (Moshtagh, 2005), to fit an ellipsoid for each genome structure.The optimal ellipsoid defined as

Figure 1 .
Figure1.Computer model of the human nucleus for structural and dynamical characterizations.(A) 3D rendering of the nucleus model with particlebased representations for the 46 chromosomes shown as ribbons, the nuclear lamina (gray), nucleoli (cyan), and speckles (yellow).As shown on the right, chromosomes are modeled as beads-on-a-string polymers at a 100 KB resolution, with the beads further categorized into compartment A (red), compartment B (light blue), or centromeric regions (green).(B) Speckle particles undergo chemical modifications concurrent to their spatial dynamics, and the de-phosphorylated (dP) particles contribute to droplet formation.(C) Illustration of the ideal and compartment potential that promotes chromosome compaction and microphase separation.Specific interactions between chromosomes and nuclear landmarks are shown on the right.

Figure 2 .
Figure 2. Overview of the iterative algorithm for parameterizing the nucleus model with experimental data.Starting from an initial set of parameters, we perform molecular dynamics (MD) simulations to produce an ensemble of nuclear structures.These structures can be transformed into contacts between chromosomes or between chromosomes and nuclear landmarks for direct comparison with experimental data.Differences between simulated and experiment contacts are used to update parameters for additional rounds of optimization if needed.The online version of this article includes the following figure supplement(s) for figure 2: Figure supplement 1.The number of speckle clusters formed along a typical simulation trajectory.

Figure 3 .
Figure 3. OpenNucleome facilitates GPU-accelerated simulations of the human nucleus.(A) Illustration of workflow for setting up, performing, and analyzing molecular dynamics (MD) simulations.(B) Python scripts setting up whole nucleus simulations.(C) Performance of MD simulations on different number of CPU cores and a single GPU.

Figure 4 .
Figure 4. Simulated structures reproduce contact frequencies between chromosomes and between chromosomes and nuclear landmarks.(A) Comparison between simulated (top right) and experimental (bottom left) whole-genome contact probability maps with Pearson correlation coefficient r = 0.89.Zoom-ins of various regions are provided in Figure 4-figure supplement 1. (B) Comparison between simulated and experimental average contact frequencies, including average contacts between genomic loci from the same chromosomes at a given separation (top), average Figure 4 continued on next page Figure supplement 1. Zoom-in of various regions in the contact map presented in Figure 4 further supports the agreement between simulation and experiment.

Figure supplement 2 .
Figure supplement 2. Configurations used to initialize simulations capture the heterogeneity in interchromosomal contacts seen in DNA-MERFISH data.

Figure 7 .
Figure 7. Nuclear deformations influence genome organization while preserving chromatin-speckle contacts.(A) Illustration of force-induced nuclear envelope deformation.The nuclear lamina is modeled as a particle mesh where neighboring lamina particles are covalently bonded together.(B) Example nucleus conformations at different strengths of applied force.(C) Pearson correlation coefficients between results from simulations of deformed nuclei and those from a spherical nucleus for interchromosomal contacts (left), DamID profiles (middle), and TSA-Seq (right).The values at zero force were computed from two independent simulations starting from the same initial configurations.The online version of this article includes the following figure supplement(s) for figure 7:Figure supplement 1. Impact of nuclear deformation on normalized chromosome radial positions.
Figure 6-figure supplement 2C and D .
( x − c ) T A ( x − c ) ≡ 1 is obtained by optimizing min ( log ( det [ A ]))subjecting to the constraint that( x i − c ) T A ( x i − c ) ≤ 1 .xi correspond to the list of chromatin positions determined experimentally.figure3.Parameters of the ideal potential, α ideal , as defined in Equation12.Numerical values for α ideal are included in the software's GitHub repository.