# <font face="Helvetica" size="7">Data Challenge 2: Assistance Notebook</font>  

<hr style="border: 1.5pt solid #a859e4; width: 100%; margin-top: -10px;">

<i> Authors: Amber Malpas, Katarzyna Kruszyńska, Etienne Bachalet, Ali Crisp </i>

<br>

If you would like an introduction to Python notebooks, please read this tutorial: https://medium.com/codingthesmartway-com-blog/getting-started-with-jupyter-notebook-for-python-4e7082bd5d46

## <font face="Helvetica" size="6"> Installation and Set-Up </font>

<hr style="border: 1.5pt solid #a859e4; width: 100%; margin-top: -10px;">

Please note: you must **save this notebook in a space owned by you** (a GitHub repo, a gist, Google Drive, or a local folder) if you want to come back to it later without losing your progress. You can edit and run this notebook on Colab, but it **will not auto-save** for you.

If you choose to use local resources, your notebook will use your local packages, so you should set up a virtual environment with the following packages. Run the cell below to create a downloadable `.yml` file that automates the package installation (provided you are using Anaconda).

In [None]:
yaml = '''name: minicourse
channels:
  - conda-forge
dependencies:
  - python=3.11
  - numpy
  - matplotlib
  - pandas
  - scipy
  - jupyter
  - ipython
  - astropy
  - beautifulsoup4
  - lxml              # required parser for bs4
  - pip
  - pip:
      - pathos
      - MulensModel'''

# save the yaml
with open('environment.yml', 'w') as f:
    f.write(yaml)

Click the folder button on the side bar to open the file explorer. The file `environment.yml` should be in there now. Just click the triple dots on the side and then `Download` to download the `.yml` file.

```bash
conda env create -f environment.yml
```

Running the above line in a terminal (Anaconda Prompt on Windows) will create a virtual conda environment called `minicourse`, which has the required packages installed.

You can activate the environment with:

```bash
conda activate minicourse
```
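
Once the environment is active, it can be worth confirming that it actually provides the packages listed in `environment.yml`. The sketch below is one way to check from Python. The helper name `missing_packages` is hypothetical, and the import names (for example `bs4` for `beautifulsoup4`) are assumed to match the installed distributions.

```python
from importlib.util import find_spec

# Import names for the packages listed in environment.yml.
# Assumption: beautifulsoup4 is imported as "bs4"; the rest use their package name.
REQUIRED = ["numpy", "matplotlib", "pandas", "scipy", "astropy",
            "bs4", "lxml", "pathos", "MulensModel"]


def missing_packages(names):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

If anything is reported as missing, re-creating the environment from the `.yml` file is usually simpler than installing packages one at a time.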

From here you have two options:

1. You can open the notebook by running
```bash
jupyter notebook
```
from a parent folder of your locally saved copy of this notebook, then navigating to the notebook in your browser. You may need to select `minicourse` as your kernel before running the notebook.

2. Alternatively, you can create a local runtime for your Colab notebook by following [these instructions](https://www.google.com/url?q=https%3A%2F%2Fresearch.google.com%2Fcolaboratory%2Flocal-runtimes.html) and starting Jupyter with:
```bash
jupyter notebook --NotebookApp.allow_origin='https://colab.research.google.com' --port=8888 --no-browser
```

  ⚠️ We recommend you take care when doing this with notebooks that you didn't write as it gives them access to your local machine.

Before continuing with this notebook, **please run the following import and set-up cells** by pressing the play button or `Shift + Enter`.

<!--
## <font face="Helvetica" size="6"> Dev Notes </font>
<hr style="border: 1.5pt solid #fc3d21; width: 100%; margin-top: -10px;">
-->



In [None]:
#@title General Imports

# system tools
import os
import sys
from io import StringIO
import time
from typing import Tuple, Callable, Optional, List
import shutil

# data analysis tools
import numpy as np
import matplotlib.pyplot as plt
from IPython import get_ipython
from IPython.display import display
from scipy.optimize import minimize
import astropy.units as u
from astropy.coordinates import Angle, SkyCoord
try:
    from google.colab import sheets  # will only work if you are running on Colab
except ImportError:
    pass  # not on Colab; the interactive sheets are simply skipped

# web scraping tools
import bs4 as bs
import urllib
import urllib.request
import pandas as pd

# parallel processing tools
!pip install pathos
from pathos.multiprocessing import ProcessingPool as Pool  # for multiprocessing inside jupyter
import multiprocessing as mp  # Ensure this is imported



In [None]:
#@title Store and reset the notebook's working directory
# This cell restores the working directory in case of failure.
# Make sure you run this cell before running the MulensModel package fix

def reset_cwd():
    # _cwd is stored at module level so that it survives between calls
    global _cwd

    cwd = os.getcwd()
    print('current working directory:', cwd)
    try:
        # return to the stored directory
        os.chdir(_cwd)
    except NameError:
        # first run: _cwd is not yet defined, so store the current directory
        _cwd = cwd
        os.chdir(_cwd)

    print('working directory reset to:', _cwd)

    return _cwd

cwd = reset_cwd()

current working directory: /content
working directory reset to: /content


## <font face="Helvetica" size="6"> Introduction </font>

<hr style="border: 1.5pt solid #a859e4; width: 100%; margin-top: -10px;">

Welcome to the **Data Challenge: Assistance Notebook**.

This notebook is brought to you by the RGES-PIT and is intended as an introductory workbook for users new to microlensing event fitting, users who would like a refresher, and users who would like a comprehensive introduction to tools they have not used before.

The data challenge is intended to be a semi-realistic representation of the data volume and type expected from the Roman Galactic Bulge Time Domain Survey, specifically for microlensing events and microlensing false positives.

Our aim is to provide you with a realistic view of what working with bulk microlensing data involves. This tool is designed to help you build confidence in managing large datasets and using established fitting tools.

### <font face="Helvetica" size="5"> What data are used in this notebook? </font>

This notebook primarily uses lightcurves from [Data Challenge 1](https://www.microlensing-source.org/data-challenge/). None of the fits inside this notebook perform the tasks of Data Challenge 2 in any way. The data will be [cloned](https://docs.github.com/en/repositories/creating-and-managing-repositories/cloning-a-repository) and sorted in later sections. Because Data Challenge 1 concluded in 2018, we can make full use of the [parameter "truths"](https://en.wikipedia.org/wiki/Statistical_parameter#:~:text=parameter%20describes%20the-,true%20value,-calculated%20from%20the) for each event to verify our fitting processes.

The dataset consists of two lightcurve files for each event or star, representing the data from Roman's `W149` and `Z087` filters. The files are in ASCII format with the columns BJD, Aperture_Magnitude and Error, and follow the file-naming convention: `ulwdc1_nnn_[W149/Z087].txt`
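
A file in this format can be read with a few lines of Python. The sketch below is one possible parser, not part of the challenge tooling: `Epoch` and `parse_lightcurve` are hypothetical names, the sample rows are made-up placeholder values, and skipping `#` comment lines is an assumption about the files rather than a documented feature.

```python
from typing import List, NamedTuple


class Epoch(NamedTuple):
    """One row of a lightcurve file."""
    bjd: float
    mag: float
    mag_err: float


def parse_lightcurve(text: str) -> List[Epoch]:
    """Parse whitespace-separated BJD, Aperture_Magnitude, and Error columns."""
    epochs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank and comment lines
            continue
        bjd, mag, err = (float(token) for token in line.split()[:3])
        epochs.append(Epoch(bjd, mag, err))
    return epochs


# Placeholder rows for illustration only (not real challenge data)
sample = """\
# BJD Aperture_Magnitude Error
2458234.10 18.250 0.012
2458234.35 18.243 0.011
"""
for epoch in parse_lightcurve(sample):
    print(epoch.bjd, epoch.mag, epoch.mag_err)
```

For a real file, the same function can be applied to the result of `open(path).read()`.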

Supplementary files were also provided, including `wfirst_ephemeris.txt`, which contains the `BJD` and the 3D spacecraft location within the solar system. Information was provided on the surface-brightness color relation for `Z087-W149` to enable lens masses to be determined where applicable.

It should be noted that in the simulated data, the inertial frame of reference was defined with the $x$-axis increasing from the binary center of mass towards the less massive lens at `t0`, the time of closest approach to the center of mass. If viewed from the solar system barycenter, the inertial frame moves at the relative velocity `vlens_CoM - vobserver(t0)`. The inclination of the orbit is a counter-clockwise rotation about the $x$-axis. $\alpha$ is the angle that the source trajectory made with the $x$-axis (if parallax was 0). Where finite source effects were significant, a linear limb darkening law was applied.

### <font face="Helvetica" size="5"> What lightcurve models are included in this notebook? </font>

The first data challenge contained the following kinds of lightcurves:

* Cataclysmic Variable Star (CVS) false positive
* Single lens microlensing events
* Binary lens microlensing events

You can expect a much broader set in the current data challenge with the addition of binary source microlensing events and more higher-order effects, including parallax, lens orbital motion and Xallarap. Higher-order effects included in the first data challenge, which are still included in this data challenge, are finite source effects and ... .

Data Challenge 2 also includes the following additional data types:
* Astrometric timeseries
* Image postage stamps

### <font face="Helvetica" size="5"> What fitting codes are used in this notebook? </font>

Some open source microlensing codes:

| Name | Notes | Maintained | Link | Covered here |
| :-: | :-: | :-: | :-: | :-: |
| MulensModel | User-friendly single- and binary-lens fitting code. | Yes ([Poleski](https://github.com/rpoleski)) | [GitHub](https://github.com/rpoleski/MulensModel) | [Yes](#mulensmodel) |
| BAGLE | Incorporates photometric and astrometric microlensing. | Yes ([Moving Universe Lab](https://github.com/MovingUniverseLab)) | [GitHub](https://github.com/MovingUniverseLab/BAGLE_Microlensing) | [Yes](#bagel) |
| VBMicrolensing | A more general version of the binary-lens code [VBBL](https://github.com/valboz/VBBinaryLensing). VBMicrolensing <br>is a tool for efficient computation in gravitational microlensing events <br>using the advanced contour integration method, supporting single, binary <br>and multiple lenses. | Yes ([Bozza](https://github.com/valboz)) | [GitHub](https://github.com/valboz/VBMicrolensing) | [Yes](#vbmicrolensing) |
| pyLIMA | pyLIMA is the first open source software for modeling microlensing <br>events. It should be flexible enough to handle your data and fit it. You can <br>also practice by simulating events. Useful for space-based observations. | Yes ([Bachelet](https://github.com/ebachelet)) | [GitHub](https://github.com/ebachelet/pyLIMA) | [Yes](#pylima) |
| RTModel | Hands-off model fitting with built-in model <i>"interpretation"</i> <br>(e.g. determining single-lens vs binary-lens arrangement) | Yes ([Bozza](https://github.com/valboz)) | [GitHub](https://github.com/valboz/RTModel) | [Yes](#rtmodel) |
| eesunhong | No general description found. <br>See [Bennett and Rhie (1996)](https://ui.adsabs.harvard.edu/abs/1996ApJ...472..660B/abstract) and [Bennett (2010)](https://ui.adsabs.harvard.edu/abs/2010ApJ...716.1408B/abstract) | Yes (Bennett) | [GitHub](https://github.com/golmschenk/eesunhong) | No |
| pyLIMASS | Addition to pyLIMA for estimating physical properties of the lens <br>system. See [Bachelet, Hundertmark, and Calchi Novati (2024)](https://ui.adsabs.harvard.edu/abs/2024AJ....168...24B/abstract) | Yes ([Bachelet](https://github.com/ebachelet)) | [GitHub](https://github.com/ebachelet/pyLIMA/tree/master/pyLIMA/pyLIMASS) | [Yes](#pylima) |
| popclass | Provides a flexible, probabilistic framework for classifying the lens <br>of a gravitational microlensing event. | Yes ([LLNL](https://github.com/LLNL)) | [GitHub](https://github.com/LLNL/popclass) | [Yes](#popclass) |
| muLAn | Designed for fitting Roman microlensing lightcurve data. | No ([Cassan](https://github.com/ArnaudCassan)/[Ranc](https://github.com/clementranc)) | [GitHub](https://github.com/muLAn-project/muLAn) | [Yes](#mulan) |
| triplelens | Calculates light curves and image positions for triple microlensing <br>systems. (When the mass ratio is small (below ~ 1e-5), the solutions <br>from the lens equation solver are more accurate when the origin <br>of the coordinate system is set to be close to the smallest mass.) | No ([Kuang](https://github.com/rkkuang)) | [GitHub](https://github.com/rkkuang/triplelens) | No |
| SingleLensFitter | Fits single lens events with finite source effects | No ([Albrow](https://github.com/MichaelDAlbrow)) | [GitHub](https://github.com/MichaelDAlbrow/SingleLensFitter) | No |
| GullsPosteriors | Collects posteriors for simulated microlensing events | No ([Malpas](https://github.com/AmberLee2427)) | [GitHub](https://github.com/AmberLee2427/GullsPosteriors) | No |
<br>

### <font face="Helvetica" size="5"> Microlensing learning resources </font>

* [RGES-PIT Minicourse](https://rges-pit.org/outreach_mini_landing/)

  The RGES PIT has developed a microlensing mini course for select students to participate in various Roman-related lectures and activities during the Summer of 2025. The virtual lectures were held in mid May 2025; you can find lecture materials, recordings, and assignments on the linked page.

* [Microlensing Source](https://www.microlensing-source.org/)

  Microlensing Source is a resource center for all aspects of gravitational microlensing. It aims to make microlensing more accessible for anyone with an interest in the subject - including students considering a career in the field, citizen scientists and those looking for a ready reference.

* [The Microlenser's Guide to the Galaxy](https://github.com/AmberLee2427/TheMicrolensersGuideToTheGalaxy)

  The goal of this project is to create an all-encompassing collection of Jupyter notebooks, your trusty companions for engaging exercises related to microlensing. Through these notebooks, the insights and experiences of microlensing veterans can light your path as you embark on your journey of discovery and exploration through scientific research.

* [2017 Sagan Workshop](http://nexsci.caltech.edu/workshop/2017/)

  The 2017 Sagan Summer Workshop focused on searching for planets with Roman (previously known as WFIRST) microlensing. Leaders in the field discussed the importance of microlensing to understanding planetary populations and demographics, especially beyond the snow line. They reviewed the microlensing method, both in the context of current capabilities and the future Roman microlensing survey. In addition, speakers addressed the broad potential of Roman's Wide Field Imaging microlensing survey for (non-microlensing) science in the Galactic bulge. Attendees participated in hands-on group projects related to the Roman microlensing planet survey and had the opportunity to present their own work through short presentations (research POPs) and posters.
  The recordings from this workshop can be found [here](https://www.youtube.com/watch?v=QPfKucBb9B8&list=PLIbTYGsIVYthWRS14eCEK8SK9IOTcaYsf).

* [Glossary of Terms](https://www.microlensing-source.org/glossary/)

  This glossary, from Microlensing Source, is intended as a quick reference, particularly to disambiguate the different symbol sets used by different authors over time. Interested readers are referred to the references at the bottom for a full discussion, especially Skowron et al. (2011), and to the Learning Resources menu.



## <font face="Helvetica" size="6"> Collecting the Data </font>

<hr style="border: 1.5pt solid #a859e4; width: 100%; margin-top: -10px;">

You can collapse this section and blindly click the play button to run all 13 cells in this section, which will download the data and organize it into dataframes.

In [None]:
#@title Cloning the GitHub repository

# clone the microlensing data challenge repo
!git clone https://github.com/microlensing-data-challenge/data-challenge-1.git

# Extract the lightcurve files
!tar -xzvf data-challenge-1/lc.tar.gz -C data-challenge-1/

fatal: destination path 'data-challenge-1' already exists and is not an empty directory.
lc/
lc/ulwdc1_001_W149.txt
lc/ulwdc1_001_Z087.txt
lc/ulwdc1_002_W149.txt
lc/ulwdc1_002_Z087.txt
lc/ulwdc1_003_W149.txt
lc/ulwdc1_003_Z087.txt
lc/ulwdc1_004_W149.txt
lc/ulwdc1_004_Z087.txt
lc/ulwdc1_005_W149.txt
lc/ulwdc1_005_Z087.txt
lc/ulwdc1_006_W149.txt
lc/ulwdc1_006_Z087.txt
lc/ulwdc1_007_W149.txt
lc/ulwdc1_007_Z087.txt
lc/ulwdc1_008_W149.txt
lc/ulwdc1_008_Z087.txt
lc/ulwdc1_009_W149.txt
lc/ulwdc1_009_Z087.txt
lc/ulwdc1_010_W149.txt
lc/ulwdc1_010_Z087.txt
lc/ulwdc1_011_W149.txt
lc/ulwdc1_011_Z087.txt
lc/ulwdc1_012_W149.txt
lc/ulwdc1_012_Z087.txt
lc/ulwdc1_013_W149.txt
lc/ulwdc1_013_Z087.txt
lc/ulwdc1_014_W149.txt
lc/ulwdc1_014_Z087.txt
lc/ulwdc1_015_W149.txt
lc/ulwdc1_015_Z087.txt
lc/ulwdc1_016_W149.txt
lc/ulwdc1_016_Z087.txt
lc/ulwdc1_017_W149.txt
lc/ulwdc1_017_Z087.txt
lc/ulwdc1_018_W149.txt
lc/ulwdc1_018_Z087.txt
lc/ulwdc1_019_W149.txt
lc/ulwdc1_019_Z087.txt
lc/ulwdc1_020_W149.txt
lc/ulwdc1_

In [None]:
#@title Displaying PDFs in a notebook (browser dependent compatibility)
#from IPython.display import IFrame
#
## Assuming the PDF is in the current working directory
#pdf_path = "data-challenge-1/Answers/DataChallenge2019_Summary_byJenniferYee.pdf"
#
## Display the PDF using IFrame
#IFrame(pdf_path, width=800, height=600)

### <font face="Helvetica" size="5"> Single Lens Events </font>

This dataset includes 293 lightcurves, 74 of which are single lens events. We can cheat a little and specifically pull out the events that we know to be single lenses. This keeps the challenge tractable for completion within the hour, with the added benefit of making the strangely organized `master_file.txt` easier to wrangle.

In [None]:
#@title Putting everything in a tidy data frame

# relative paths, so the cell also works outside Colab's /content directory
master_file = 'data-challenge-1/Answers/master_file.txt'
header_file = 'data-challenge-1/Answers/wfirstColumnNumbers.txt'

rows = []
with open(master_file, "r") as f:
    for line in f:
        line = line.strip()
        # Skip empty lines or comment lines
        if not line or line.startswith("#"):
            continue

        tokens = line.split()  # split on whitespace
        # Keep only single-lens events
        if "dcnormffp" not in tokens:
            continue

        # Single-lens lines should have exactly 96 columns
        if len(tokens) != 96:
            continue

        rows.append(tokens)

df_sl = pd.DataFrame(rows)

# make an array of 96 placeholder zeros, one per column
colnames_96 = np.zeros(96, dtype=object)

# Read the header file
with open(header_file, 'r') as f:
    for line in f:
        line = line.strip()
        # Skip empty lines or comments
        if not line or line.startswith('#'):
            continue
        # The second token is the 'name'
        parts = line.split()
        colnames_96[int(parts[0])] = parts[1]

#For single lenses they are (***Note for these, the mass of the lens is given by the planet mass column, not the host mass column):
#72 - unimportant
#73 - N, number of consecutive W149 data points deviating by >=3 sigma from a flat line
#74 - unimportant
#75 - Delta chi^2 (relative to a flat line)
#76-91 - unimportant
#92 - simulated event type (dcnormffp = single lens or free-floating planet)
#93 - unimportant (I think)
#94 - lightcurve filename root
#95 - Data challenge lightcurve number

# Replace the column names in colnames_96
colnames_96[73] = 'N'
colnames_96[75] = 'Delta chi2'
colnames_96[92] = 'sim type'
colnames_96[94] = 'filename'
colnames_96[95] = 'lc_number'

# Make sure the column names are unique
for i in range(94):
    if colnames_96[i] == '|' or colnames_96[i] == 0:
        colnames_96[i] = 'col_' + str(i)

# Replace the column names in the data_frame
df_sl.columns = colnames_96

# Remove the dummy columns 'col_*'
df_sl = df_sl.loc[:, ~df_sl.columns.str.startswith('col_')]

df_sl

Unnamed: 0,idx,subrun,field,l,b,ra,dec,src_id,Ds,Rs,...,sigma_q,sigma_rs,sigma_F00,sigma_fs0,sigma_F01,sigma_fs1,sigma_thetaE,sim type,filename,lc_number
0,1694,0,82,1.17028,-2.26944,269.319,-29.0889,9926,7.939,0.212,...,999999,999999,8905.98,76994.9,7414.03,3.44036e+09,-2.69363e+11,dcnormffp,dcnormffp_0_82_1694,5
1,539,0,82,1.05037,-2.20017,269.182,-29.158,16904,10.253,0.361,...,999999,999999,728415,435.739,50.4485,166721,-6.04003e+06,dcnormffp,dcnormffp_0_82_539,17
2,1278,0,82,1.18168,-2.1191,269.177,-29.0038,17351,10.64,0.375,...,999999,999999,2.78749e+06,629.296,7.91374,91347.9,-1.23727e+07,dcnormffp,dcnormffp_0_82_1278,21
3,1479,0,82,1.13373,-2.23737,269.267,-29.1045,8329,7.427,0.212,...,999999,999999,6.98775e+06,250.186,78.6551,59875.6,-979126,dcnormffp,dcnormffp_0_82_1479,22
4,318,0,82,1.08784,-2.21568,269.219,-29.1334,16773,10.169,0.535,...,999999,999999,3.11924e+06,2.92871,0.163627,1.18527,-215.015,dcnormffp,dcnormffp_0_82_318,29
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
69,718,0,82,1.17865,-2.19791,269.253,-29.0459,11807,8.458,0.293,...,999999,999999,592405,6337.05,10.9393,8.54477e+06,-1.21792e+09,dcnormffp,dcnormffp_0_82_718,285
70,1378,0,82,1.04452,-2.21048,269.189,-29.1682,2183,7.966,1.576,...,999999,999999,809.398,228.589,1.10915,21.9532,21943.9,dcnormffp,dcnormffp_0_82_1378,286
71,954,0,82,1.04501,-2.16747,269.147,-29.1462,14473,9.202,0.328,...,999999,999999,325492,21090.4,1237.92,1.09519e+08,-1.3671e+10,dcnormffp,dcnormffp_0_82_954,288
72,259,0,82,1.18813,-2.0865,269.149,-28.9819,10930,8.206,0.375,...,999999,999999,2.07119e+06,3929.98,34.1225,1.19947e+07,-5.83016e+08,dcnormffp,dcnormffp_0_82_259,290


The last column in this data frame has the lightcurve number, which we can use to pick out only the lightcurves matching our single-lens event list, for analysis.

In [None]:
#@title Figuring out which files we want

lc_number = df_sl['lc_number'].to_numpy()

lc_file_path_format = 'data-challenge-1/lc/ulwdc1_XXX_filter.txt'

lc_file_paths_W149 = [lc_file_path_format.replace('filter', 'W149')] * len(lc_number)
lc_file_paths_Z087 = [lc_file_path_format.replace('filter', 'Z087')] * len(lc_number)

# replace XXX, from the right, with the lc_number which is not necessarily of length 3
lc_file_paths_W149 = [path.replace('XXX', str(num).zfill(3)) for path, num in zip(lc_file_paths_W149, lc_number)]
lc_file_paths_Z087 = [path.replace('XXX', str(num).zfill(3)) for path, num in zip(lc_file_paths_Z087, lc_number)]

df_sl['lc_file_path_W149'] = lc_file_paths_W149
df_sl['lc_file_path_Z087'] = lc_file_paths_Z087

df_sl

Unnamed: 0,idx,subrun,field,l,b,ra,dec,src_id,Ds,Rs,...,sigma_F00,sigma_fs0,sigma_F01,sigma_fs1,sigma_thetaE,sim type,filename,lc_number,lc_file_path_W149,lc_file_path_Z087
0,1694,0,82,1.17028,-2.26944,269.319,-29.0889,9926,7.939,0.212,...,8905.98,76994.9,7414.03,3.44036e+09,-2.69363e+11,dcnormffp,dcnormffp_0_82_1694,5,data-challenge-1/lc/ulwdc1_005_W149.txt,data-challenge-1/lc/ulwdc1_005_Z087.txt
1,539,0,82,1.05037,-2.20017,269.182,-29.158,16904,10.253,0.361,...,728415,435.739,50.4485,166721,-6.04003e+06,dcnormffp,dcnormffp_0_82_539,17,data-challenge-1/lc/ulwdc1_017_W149.txt,data-challenge-1/lc/ulwdc1_017_Z087.txt
2,1278,0,82,1.18168,-2.1191,269.177,-29.0038,17351,10.64,0.375,...,2.78749e+06,629.296,7.91374,91347.9,-1.23727e+07,dcnormffp,dcnormffp_0_82_1278,21,data-challenge-1/lc/ulwdc1_021_W149.txt,data-challenge-1/lc/ulwdc1_021_Z087.txt
3,1479,0,82,1.13373,-2.23737,269.267,-29.1045,8329,7.427,0.212,...,6.98775e+06,250.186,78.6551,59875.6,-979126,dcnormffp,dcnormffp_0_82_1479,22,data-challenge-1/lc/ulwdc1_022_W149.txt,data-challenge-1/lc/ulwdc1_022_Z087.txt
4,318,0,82,1.08784,-2.21568,269.219,-29.1334,16773,10.169,0.535,...,3.11924e+06,2.92871,0.163627,1.18527,-215.015,dcnormffp,dcnormffp_0_82_318,29,data-challenge-1/lc/ulwdc1_029_W149.txt,data-challenge-1/lc/ulwdc1_029_Z087.txt
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
69,718,0,82,1.17865,-2.19791,269.253,-29.0459,11807,8.458,0.293,...,592405,6337.05,10.9393,8.54477e+06,-1.21792e+09,dcnormffp,dcnormffp_0_82_718,285,data-challenge-1/lc/ulwdc1_285_W149.txt,data-challenge-1/lc/ulwdc1_285_Z087.txt
70,1378,0,82,1.04452,-2.21048,269.189,-29.1682,2183,7.966,1.576,...,809.398,228.589,1.10915,21.9532,21943.9,dcnormffp,dcnormffp_0_82_1378,286,data-challenge-1/lc/ulwdc1_286_W149.txt,data-challenge-1/lc/ulwdc1_286_Z087.txt
71,954,0,82,1.04501,-2.16747,269.147,-29.1462,14473,9.202,0.328,...,325492,21090.4,1237.92,1.09519e+08,-1.3671e+10,dcnormffp,dcnormffp_0_82_954,288,data-challenge-1/lc/ulwdc1_288_W149.txt,data-challenge-1/lc/ulwdc1_288_Z087.txt
72,259,0,82,1.18813,-2.0865,269.149,-28.9819,10930,8.206,0.375,...,2.07119e+06,3929.98,34.1225,1.19947e+07,-5.83016e+08,dcnormffp,dcnormffp_0_82_259,290,data-challenge-1/lc/ulwdc1_290_W149.txt,data-challenge-1/lc/ulwdc1_290_Z087.txt
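
The path construction above can also be sketched more compactly with an f-string format specifier. The helper name `lightcurve_path` and its `root` default are illustrative choices, not part of the challenge repository.

```python
def lightcurve_path(lc_number, band, root="data-challenge-1/lc"):
    """Build the path to one lightcurve file from its number and filter name."""
    if band not in ("W149", "Z087"):
        raise ValueError(f"unknown filter: {band}")
    # int() accepts both 5 and "5"; the :03d specifier zero-pads to three digits
    return f"{root}/ulwdc1_{int(lc_number):03d}_{band}.txt"


print(lightcurve_path("5", "W149"))  # data-challenge-1/lc/ulwdc1_005_W149.txt
```

Calling `int()` first matters here because the `lc_number` values read from `master_file.txt` are strings until they are converted.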


There are a few pieces of information that may need to be known for each event that are not in the lightcurve files. These are stored in `event_info.txt`.

Columns: `"Event_name"` `"Event_number"` `"RA_(deg)"` `"Dec_(deg)"` `"Distance"` `"A_W149"` `"sigma_A_W149"` `"A_Z087"` `"sigma_A_Z087"`

`Distance`, `A_W149`, and `A_Z087` are estimates of the distance to the red clump stars and of their extinction in each band. `sigma_A_W149` and `sigma_A_Z087` are the dispersions in those extinctions.
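
As a worked example of how these columns might be used, the color excess between the two bands is the difference of their extinctions, $E(Z087 - W149) = A_{Z087} - A_{W149}$. The sketch below assumes the extinctions are in magnitudes and that the two dispersions are independent, so that they add in quadrature. The function name `color_excess` is hypothetical.

```python
import math


def color_excess(a_z087, sigma_a_z087, a_w149, sigma_a_w149):
    """Return E(Z087 - W149) and its dispersion, assuming independent errors."""
    excess = a_z087 - a_w149
    sigma = math.hypot(sigma_a_z087, sigma_a_w149)  # quadrature sum
    return excess, sigma


# Extinction values listed for ulwdc1_001 in event_info.txt
excess, sigma = color_excess(1.41, 0.01, 0.73, 0.01)
print(f"E(Z087 - W149) = {excess:.2f} +/- {sigma:.3f}")  # 0.68 +/- 0.014
```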

In [None]:
#@title Event information data frame

header = ["Event_name",
          "Event_number",
          "RA_(deg)",
          "Dec_(deg)",
          "Distance",
          "A_W149",
          "sigma_A_W149",
          "A_Z087",
          "sigma_A_Z087"
]

# sep=r'\s+' replaces delim_whitespace=True, which recent pandas versions deprecate
event_info = pd.read_csv('./data-challenge-1/event_info.txt', names=header, sep=r'\s+')
event_info


Unnamed: 0,Event_name,Event_number,RA_(deg),Dec_(deg),Distance,A_W149,sigma_A_W149,A_Z087,sigma_A_Z087
0,ulwdc1_001,1,269.165,-29.0207,8.18,0.73,0.01,1.41,0.01
1,ulwdc1_002,2,269.959,-30.1918,8.09,0.49,0.01,0.95,0.01
2,ulwdc1_003,3,269.100,-29.0983,8.18,0.73,0.01,1.41,0.01
3,ulwdc1_004,4,268.036,-28.3744,8.25,1.35,0.07,2.60,0.14
4,ulwdc1_005,5,269.319,-29.0889,8.18,0.73,0.01,1.41,0.01
...,...,...,...,...,...,...,...,...,...
288,ulwdc1_289,289,267.813,-28.6965,8.07,1.52,0.07,2.93,0.14
289,ulwdc1_290,290,269.149,-28.9819,8.18,0.73,0.01,1.41,0.01
290,ulwdc1_291,291,267.999,-29.7203,8.57,0.71,0.01,1.36,0.01
291,ulwdc1_292,292,269.151,-29.1433,8.18,0.73,0.01,1.41,0.01


In [None]:
#@title Combining the two data frames

# Convert 'lc_number' to numeric type before merging
merged_sl_df = pd.merge(event_info, df_sl.astype({'lc_number': 'int64'}), left_on='Event_number', right_on='lc_number', how='inner')
merged_sl_df

Unnamed: 0,Event_name,Event_number,RA_(deg),Dec_(deg),Distance,A_W149,sigma_A_W149,A_Z087,sigma_A_Z087,idx,...,sigma_F00,sigma_fs0,sigma_F01,sigma_fs1,sigma_thetaE,sim type,filename,lc_number,lc_file_path_W149,lc_file_path_Z087
0,ulwdc1_005,5,269.319,-29.0889,8.18,0.73,0.01,1.41,0.01,1694,...,8905.98,76994.9,7414.03,3.44036e+09,-2.69363e+11,dcnormffp,dcnormffp_0_82_1694,5,data-challenge-1/lc/ulwdc1_005_W149.txt,data-challenge-1/lc/ulwdc1_005_Z087.txt
1,ulwdc1_017,17,269.182,-29.1580,8.18,0.73,0.01,1.41,0.01,539,...,728415,435.739,50.4485,166721,-6.04003e+06,dcnormffp,dcnormffp_0_82_539,17,data-challenge-1/lc/ulwdc1_017_W149.txt,data-challenge-1/lc/ulwdc1_017_Z087.txt
2,ulwdc1_021,21,269.177,-29.0038,8.18,0.73,0.01,1.41,0.01,1278,...,2.78749e+06,629.296,7.91374,91347.9,-1.23727e+07,dcnormffp,dcnormffp_0_82_1278,21,data-challenge-1/lc/ulwdc1_021_W149.txt,data-challenge-1/lc/ulwdc1_021_Z087.txt
3,ulwdc1_022,22,269.267,-29.1045,8.18,0.73,0.01,1.41,0.01,1479,...,6.98775e+06,250.186,78.6551,59875.6,-979126,dcnormffp,dcnormffp_0_82_1479,22,data-challenge-1/lc/ulwdc1_022_W149.txt,data-challenge-1/lc/ulwdc1_022_Z087.txt
4,ulwdc1_029,29,269.219,-29.1334,8.18,0.73,0.01,1.41,0.01,318,...,3.11924e+06,2.92871,0.163627,1.18527,-215.015,dcnormffp,dcnormffp_0_82_318,29,data-challenge-1/lc/ulwdc1_029_W149.txt,data-challenge-1/lc/ulwdc1_029_Z087.txt
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
69,ulwdc1_285,285,269.253,-29.0459,8.18,0.73,0.01,1.41,0.01,718,...,592405,6337.05,10.9393,8.54477e+06,-1.21792e+09,dcnormffp,dcnormffp_0_82_718,285,data-challenge-1/lc/ulwdc1_285_W149.txt,data-challenge-1/lc/ulwdc1_285_Z087.txt
70,ulwdc1_286,286,269.189,-29.1682,8.18,0.73,0.01,1.41,0.01,1378,...,809.398,228.589,1.10915,21.9532,21943.9,dcnormffp,dcnormffp_0_82_1378,286,data-challenge-1/lc/ulwdc1_286_W149.txt,data-challenge-1/lc/ulwdc1_286_Z087.txt
71,ulwdc1_288,288,269.147,-29.1462,8.18,0.73,0.01,1.41,0.01,954,...,325492,21090.4,1237.92,1.09519e+08,-1.3671e+10,dcnormffp,dcnormffp_0_82_954,288,data-challenge-1/lc/ulwdc1_288_W149.txt,data-challenge-1/lc/ulwdc1_288_Z087.txt
72,ulwdc1_290,290,269.149,-28.9819,8.18,0.73,0.01,1.41,0.01,259,...,2.07119e+06,3929.98,34.1225,1.19947e+07,-5.83016e+08,dcnormffp,dcnormffp_0_82_259,290,data-challenge-1/lc/ulwdc1_290_W149.txt,data-challenge-1/lc/ulwdc1_290_Z087.txt


### <font face="Helvetica" size="5"> Binary Lens Events </font>

### <font face="Helvetica" size="5"> Triple Lens Events </font>


In [None]:
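try:  # this will only work on Colab
  sl_sheet = sheets.InteractiveSheet(df=merged_sl_df)
  sl_sheet.show()
  # df_bl (binary-lens events) is not built in the cells above; if it is
  # undefined, the NameError is caught below and only the first sheet is shown
  bl_sheet = sheets.InteractiveSheet(df=df_bl)
  bl_sheet.show()
except Exception as err:
  print(f"Interactive sheet skipped: {err}")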



https://docs.google.com/spreadsheets/d/1Bfn0jsEG94v9JDW3b5326cDUMTEKZxrQa26iEFM5lpA/edit#gid=0


Great - data successfully wrangled. Let's forget we ever had to live through that and move right along.

## <font face="Helvetica" size="6"> Packages Covered in This Notebook </font>

<hr style="border: 1.5pt solid #a859e4; width: 100%; margin-top: -10px;">

<details>
<summary><font face="Helvetica" size="5">1. MulensModel</font></summary>

[Go to the `MulensModel` section](#mulensmodel)
  * [1.1 Data objects](#11-data-objects)
  * [1.2 Model objects](#12-model-objects)
  * [1.3 Event objects](#13-event-objects)
  * 1.4 Fitting
  * 1.5 Higher-order effects
    - 2S1L
    - 1S2L
    - LOM
    - Xallarap
</details>
<details>
<summary><font face="Helvetica" size="5">2. popclass </font></summary>

[Go to the `popclass` section](#popclass)

</details>
<details>
<summary><font face="Helvetica" size="5">3. pyLIMA</font></summary>

[Go to the `pyLIMA` section](#pylima)
</details>
<details>
<summary><font face="Helvetica" size="5">4. RTModel</font></summary>

[Go to the `RTModel` section](#rtmodel)
</details>
<details>
<summary><font face="Helvetica" size="5">5. BAGLE</font></summary>

[Go to the `BAGLE` section](#bagel)
</details>
<details>
<summary><font face="Helvetica" size="5">6. muLAn</font></summary>

[Go to the `muLAn` section](#mulan)
</details>

## <font face="Helvetica" size="6"> 1. MulensModel </font>

<hr style="border: 1.5pt solid #a859e4; width: 100%; margin-top: -10px;">

At the time of writing this notebook there was an unresolved bug in the `MulensModel` package: models with finite source effects raise an error about missing files. You can run all the cells in the following subsection to correct it. In short, we download the repository's `data` directory and use it to replace the one in the installed package.
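
To see which case applies before changing anything, the installed package can be inspected first. The sketch below mirrors the checks made in the cells of the following subsection; `data_dir_status` is a hypothetical helper, not part of `MulensModel`.

```python
import os


def data_dir_status(package_dir):
    """Classify the package's `data` entry as 'missing', 'file', or 'directory'."""
    data_path = os.path.join(package_dir, "data")
    if not os.path.exists(data_path):
        return "missing"
    if os.path.isfile(data_path):
        return "file"
    return "directory"


# Usage (requires MulensModel to be installed):
# import MulensModel as mm
# print(data_dir_status(os.path.dirname(mm.__file__)))
```

A result of `'directory'` corresponds to the "build looks correct" message in the subsection below, while `'file'` or `'missing'` suggests the fix is needed.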

#### <font face="Helvetica" size="5"> MulensModel package fix </font>

In [None]:
#@title Installing the package
reset_cwd()
!pip install MulensModel

current working directory: /content
working directory reset to: /content


In [None]:
#@title Importing the package
import MulensModel as mm

In [None]:
#@title Clearing the old `data` directory/file
# check this box if you would like to replace the data folder
replace_data_folder = False #@param {type:"boolean"}
skip_next_cell = False

mulensmodel_dir = os.path.dirname(mm.__file__)
data_file_path = os.path.join(mulensmodel_dir, 'data')

if os.path.exists(data_file_path):
  # make sure we have permissions to delete the data file
  os.chmod(os.path.join(mulensmodel_dir, "data"), 0o777)

  if os.path.isfile(data_file_path):
    os.remove(data_file_path)
    print(f"Removed 'data' file from {mulensmodel_dir}")
  elif replace_data_folder:
    shutil.rmtree(data_file_path)
    print(f"Removed 'data' directory from {mulensmodel_dir}")
  else:
    skip_next_cell = True
    print("""MulensModel build looks correct. If it is not working, try selecting
     `replace_data_folder` and running the subsection again.""")
else:
  print(f"No 'data' file or directory found in {mulensmodel_dir}")

MulensModel build looks correct. If it is not working, try selecting
     `replace_data_folder` and running the subsection again.


In [None]:
#@title Getting the data tree and cleaning up
if not skip_next_cell:
    # change the working directory to the mulensmodel_dir directory
    os.chdir(mulensmodel_dir)

    # starting from a fresh clone
    if os.path.exists(os.path.join(mulensmodel_dir, "MulensModel")):
        # remove the MulensModel repository
        !chmod -R u+w MulensModel
        !rm -r MulensModel

    # clone the MulensModel repository
    !git clone https://github.com/rpoleski/MulensModel.git

    # move the data directory to the mulensmodel_dir directory
    !mv MulensModel/data ./

    # remove the MulensModel repository
    !chmod -R u+w MulensModel
    !rm -r MulensModel

    # change the working directory back to the original directory
    reset_cwd()

### <font face="Helvetica" size="5"> 1.1 Data Objects </font>

When fitting Roman data with `MulensModel`, we need to make a minor adjustment to our model for data that is not ground based. In `MulensModel` this simply means adding the following keyword to the data object initialization: `ephemerides_file=PATH_TO_THE_FILE`.

> Instructions specific to this data set for `MulensModel` are given [here](https://github.com/rpoleski/MulensModel/blob/master/documents/data_challenge.md).

Most of the data for these events are in the W149 band, so we make the very reasonable decision to fit only those data and avoid dealing with multiple data sets with different $F_\textrm{S}$ and $F_\textrm{B}$ values. That should speed up our fits too. If we wanted to find the color of the source star at a later date, we could fit just the flux parameters using a linear regression, leaving the microlensing-model parameters fixed (as described in [this notebook](https://github.com/AmberLee2427/TheMicrolensersGuideToTheGalaxy/blob/3b783495eb9a916ee9670a0347c9325f6a5b0a21/Notebooks/SingleLens.ipynb)), which would take a fraction of a second per event.
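
The flux-only regression works because, for a fixed magnification curve $A(t)$, the model flux $F(t) = F_\textrm{S} A(t) + F_\textrm{B}$ is linear in the two flux parameters. The sketch below solves the weighted least-squares normal equations directly in plain Python. It is an illustration under that assumption, not the routine used by any particular package; the name `fit_fluxes` and the synthetic numbers are made up.

```python
def fit_fluxes(magnification, flux, flux_err):
    """Weighted linear least squares for flux = F_S * A + F_B."""
    s_1 = s_a = s_aa = s_f = s_af = 0.0
    for a, f, err in zip(magnification, flux, flux_err):
        w = 1.0 / err**2          # inverse-variance weight
        s_1 += w
        s_a += w * a
        s_aa += w * a * a
        s_f += w * f
        s_af += w * a * f
    det = s_aa * s_1 - s_a**2
    if det == 0.0:
        raise ValueError("the magnification must vary to separate F_S from F_B")
    f_s = (s_1 * s_af - s_a * s_f) / det
    f_b = (s_aa * s_f - s_a * s_af) / det
    return f_s, f_b


# Noiseless synthetic check with made-up values: F_S = 100, F_B = 20
A = [1.0, 1.5, 3.0, 1.2]
F = [100.0 * a + 20.0 for a in A]
f_s, f_b = fit_fluxes(A, F, [1.0] * len(A))
print(round(f_s, 6), round(f_b, 6))
```

With array data, `numpy.linalg.lstsq` would be a natural replacement for the explicit sums.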

In [None]:
#@title Collecting the relevant metadata
data_file = merged_sl_df['lc_file_path_W149'][0]
ra = merged_sl_df['RA_(deg)'][0]
dec = merged_sl_df['Dec_(deg)'][0]
print(ra, dec)

# convert decimal ra and dec in degrees to "17h57m16.56s -29d05m20.04s"
coord = SkyCoord(ra=ra * u.deg, dec=dec * u.deg, frame='icrs')
hms_dms_string = coord.to_string('hmsdms')
print(f"SkyCoord default: {hms_dms_string}")

In [None]:
#@title Example L2 `Data` object

# Here is the main difference for space data - we provide the ephemeris for Roman:
EPHEM_FILE = 'data-challenge-1/wfirst_ephemeris_W149.txt'
data_Roman_W149 = mm.MulensData(file_name=data_file,
                                phot_fmt='mag',
                                ephemerides_file=EPHEM_FILE,
                                plot_properties={'color': '#a859e4',
                                                 'label': 'Roman W149'
                                                 },
                                bandpass='H'
                               )

### <font face="Helvetica" size="5"> 1.2 Model Objects </font>
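
For orientation, a model maps parameters such as `t_0`, `u_0`, and `t_E` to a magnification curve. The sketch below evaluates the standard point-source, point-lens (Paczyński) magnification in plain Python. It is a simplified stand-in that ignores the `rho` and parallax parameters used in the cells that follow, and it is not `MulensModel`'s implementation; the parameter values are placeholders.

```python
import math


def pspl_magnification(t, t_0, u_0, t_E):
    """Point-source point-lens magnification at time t (no parallax, no finite source)."""
    tau = (t - t_0) / t_E        # time from peak in units of the Einstein timescale
    u = math.hypot(tau, u_0)     # lens-source separation in Einstein radii
    # undefined for u == 0, where a finite-source treatment would be needed
    return (u * u + 2.0) / (u * math.sqrt(u * u + 4.0))


# Placeholder parameters, chosen only for illustration
t_0, u_0, t_E = 0.0, 0.1, 20.0
for t in (-20.0, 0.0, 20.0):
    print(t, round(pspl_magnification(t, t_0, u_0, t_E), 3))
```

The curve peaks at `t = t_0`, is symmetric about it, and tends to 1 far from the peak, which is a useful sanity check on any set of guess parameters.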

In [None]:
#@title Collecting the meta data
# split the parallax equally out of a lack of better ideas
pi_E_E = np.sqrt(float(merged_sl_df['piE'][0])**2 / 2.0)
pi_E_N = pi_E_E * 1.0

t_0 = float(merged_sl_df['t0'][0])
# Annoyingly, t_0 is in simulation time not HJD so we need to do a conversion
# (https://github.com/microlensing-data-challenge/evaluation_code/blob/master/parse_table1.py)
# line 402
t_0 = t_0 + 2458234.0  # simulation 0 time

In [None]:
#@title Making a dictionary of the guess parameters
# Let's just tidy the "guess" parameters up into a dictionary, for easy access
params = dict()
parameters_to_fit = ["t_0", "u_0", "t_E", "rho", "pi_E_N", "pi_E_E"]
params['t_0'] = t_0 * 1.0
params['t_0_par'] = t_0 * 1.0
params['u_0'] = float(merged_sl_df['u0'][0]) * 1.1
params['t_E'] = float(merged_sl_df['tE'][0]) * 1.1
params['rho'] = float(merged_sl_df['rhos'][0]) * 1.1
params['pi_E_N'] = pi_E_N
params['pi_E_E'] = pi_E_E

In [None]:
#@title Example `Model` object

# If we are using parallax, it is also important that we provide the event
# coordinates, or MulensModel can't do necessary calculations
Roman_model = mm.Model({**params},
                        coords=coord,
                        ephemerides_file=EPHEM_FILE
                       )

### <font face="Helvetica" size="5"> 1.3 Event Objects </font>

In [None]:
#@title Example `Event` object

Roman_event = mm.Event(datasets=data_Roman_W149, model=Roman_model)