Skip to content

CosmoAI-AES/CosmoSim

 
 

Repository files navigation

title
The CosmoAI project

The CosmoSim software

This project provides a simulator for gravitational lensing based on Chris Clarkson's Roulettes framework. The initial prototype was an undergraduate final year project by Ingebrigtsen, Remøy, Westbø, Nedreberg, and Austnes (2022). The software includes both a GUI simulator for interactive experimentation, and a command line interface for batch generation of datasets.

The intention is to be able to use the synthetic datasets to train machine learning models which can in turn be used to map the dark matter of the Universe. This is work in progress.

Documentation is being written at https://cosmoai-aes.github.io/, but it still incomplete and fragmented.

Installation Guide

CosmoSim is set up to build python packages (wheels) that can be installed with pip.

pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ CosmoSim

The current distribution is only available on the PyPI test server. Starting with the next release, it will be available on the regular server and installable by

pip install CosmoSim

We successfully build CosmoSim for the following configurations.

  • Linux/x86_64, python 3.11 trough 3.14, and 3.14.t
  • MacOS/arm, python 3.11 trough 3.14, and 3.14.t

Building on Windows (amd64, python 3.11-3.14) sometimes works, but currently not. Buiilding for python 3.10 is impossible because tomllib is required.

To build locally from source, you can run (from the root of the repo),

pip install build
python -m build

This is highly dependent on the local configuration, and may fail for a number of reasons. If it succeeds, the binary file for the python module appears under src/CosmoSim.

If you can build, pip can also install the package from the working directory. Again, from the root of the repo,

pip install .

For non-standard building, see BUILD.md.

Running the GUI

python3 -m CosmoSim.GUI

The GUI tool is hopefully quite self-explanatory.
The images shown are the actual source on the left and the distorted (lensed) image on the right.

The Command Line Interface (CLI)

Overview of Tools

The modules that can be run as command line scripts are the following:

  • python -m CosmoSim.datagen (simulate images from lens parameters)
  • python -m CosmoSim.roulettegen (simulate images from roulette amplitudes)
  • python -m CosmoSim.roulettestatistics (descriptive statistics of roulette amplitudes)

Testing

Once built, an illustrative test set can be generated by the following command issued from the root directory. Note that you have to install python dependencies from the requirements file. You may want to install python libraries in a virtual environment. Here this is given to do that in global user space.

mkdir images
python3 -m CosmoSim.datagen -CR -Z 400 --csvfile Datasets/debug.csv --directory images

This generates a range of images in the newly created images directory. This should be on the .gitignore.

The flags may be changed; -C centres det distorted image in the centre of the image (being debugged); -Z sets the image size; -R prints an axes cross.

Dataset generation

The basic use case is bulk generation of images. The parameter distribution can be specified in a TOML file, see Datasets/dataset.toml for an example.

The following command generates a dataset.

python3 -m CosmoSim.datagen --toml dataset.toml --csvfile dataset.csv --outfile roulette.csv --directory images -C

Note the following

  • dataset.csv - contains the dataset with all the lens and source parameters.
  • The images directory contains the images generated from the dataset.
  • roulette.csv gives the dataset with roulette amplitudes.
  • -C centres the images on the centre of mass (luminence). This is necessary to avoid leaking information to a machine learning model. Most of the options are optional, and further options are available. See below.

Two-step generation

It is possible two generate the dataset and the images in two separate steps.

python3 -m CosmoSim.dataset input.toml output.csv
python3 -m CosmoSim.datagen --csvfile output.csv --outfile roulette.csv --directory images -C

Roulette Resimulation

TODO

python3 -m CosmoSim.roulettegen 

Generating individual images

To generate images from specified parameters, you can use

python3 -m CosmoSim.datagen -S sourcemodel -L lensmodel -x x -y y -s sigma -X chi -E einsteinR -n n -I imageSize -N name -R -C

Here are the options specified:

  • lensmodel is p for point mass (exact), r for Roulette (point mass), or s for SIS (Roulette).
  • sourcemodel is s for sphere, e for ellipse, or t for triangle.
  • -C centres the image on the centre of mass (centre of light)
  • -R draw the axes cross
  • x and y are the coordinates of the actual source
  • s is the standard deviation of the source
  • chi is the distance to the lens in percent of the distance to the source
  • einsteinR is the Einstein radius of the lens
  • n is the number of terms to use in roulette sum.
    (Not used for the point mass model.)
  • imageSize size of output image in pixels. The image will be imageSize$\times$imageSize pixels.
  • name is the name of the simulation, and used to generate filenames.
  • --help for a complete list of options.

Use cases

Training sets for roulette amplitudes

The datasets generated from datasetgen.py give the parameters for the lens and the source, as well as the image file. This allows us to train a machine learning model to identify the lens parameters, assuming a relatively simple lens model. It is still a long way to go to map cluster lenses.

An alternative approach is to try to estimate the effect (lens potential) in a neighbourhood around a point in the image. For instance, we may want to estimate the roulette amplitudes in the centre of the image. The datagen.py script can generate a CSV file containing these data along with the image, as follows:

mkdir images
python3 -m CosmoSim.datagen -C -Z 400 --csvfile Datasets/debug.csv \
        --directory images --outfile images.csv --nterms 5

The images should be centred (-C); the amplitudes may not be meaningful otherwise. The --directory flag puts images in the given directory which must exist. The image size is given by -Z and is square. The input and output files go without saying. The number of terms (--nterms) is the maximum $m$ for which the amplitudes are generated; 5 should give about 24 scalar values.

The amplitudes are labeled alpha[$s$,$m$]</code> and <code>beta[$s$,$m$] in the outout CSV file. One should focus on predicting the amplitudes for low values of $m$ first. The file also reproduces the source parameters, and the centre of mass $(x,y)$ in the original co-ordinate system using image coordinates with the origin in the upper left corner.

The most interesting lens model for this exercise is PsiFunctionSIS (fs), which gives the most accurate computations. The roulette amplitudes have not been implemented for any of the point mass lenses yet, and it also does not work for «SIS (rotated)» which is a legacy implementation of the roulette model with SIS and functionally equivalent to «Roulette SIS« (rs).

Warning This has yet to be tested properly.

Other scripts

The python/ directory contains scripts which do not depend on C++ code.

  • compare.py is used to compare images in the Regression Tests.
  • Several scripts used to calculate roulette amplitudes.

Versions

  • The imortant git branches are
    • develop is the current state of the art
    • master should be the last stable version
  • Releases
    • v-test-* are test releases, used to debug workflows. Please ignore.
    • see the releases on githun and CHANGELOG.md
  • Prior to v2.0.0 some releases have been tagged, but not registered as releases in github.
    • v0.1.0, v0.2.0, v1.0.0 are versions made by the u/g students Spring 2022.
    • v1.0.1 is cleaned up to be able to build v1.0.0

Caveats

The simulator makes numerical calculations and there will always be approximation errors.

  1. The images generated from the same parameters have changed slightly between version. Some changes are because some unfortunate uses of integers and single-precision numbers have later been avoided, and some simply because the order of calculation has changed.
  2. The SIS model is implemented in two versions, one rotating to have the source on the x-axis and one working directly with arbitrary position. Difference are not perceptible by visual comparison, but the difference image shows noticeable difference.

Contributors

  • Idea and Current Maintainance Hans Georg Schaathun hasc@ntnu.no
  • Mathematical Models Ben David Normann
  • Initial Prototype Simon Ingebrigtsen, Sondre Westbø Remøy, Einar Leite Austnes, and Simon Nedreberg Runde

About

Original repo from the BSc project Spring 2022

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Python 41.3%
  • C++ 33.5%
  • Shell 18.5%
  • Makefile 2.5%
  • Dockerfile 1.7%
  • CMake 1.4%
  • Other 1.1%