# MPIA Arxiv on Deck 2: Debugging notebook

In this notebook, I keep some first-order commands for diagnosing issues with individual papers.
The main definitions are taken from the main notebook.

In [1]:
# Imports
import os
from IPython.display import Markdown, display
from tqdm.notebook import tqdm
import warnings
from PIL import Image 

# requires arxiv_on_deck_2

from arxiv_on_deck_2.arxiv2 import (get_new_papers, 
                                    get_paper_from_identifier,
                                    retrieve_document_source, 
                                    get_markdown_badge)
from arxiv_on_deck_2 import (latex,
                             latex_bib,
                             mpia,
                             highlight_authors_in_list)

# Sometimes images are really big
Image.MAX_IMAGE_PIXELS = 1000000000 

# Some useful definitions.
class AffiliationWarning(UserWarning):
    pass

class AffiliationError(RuntimeError):
    pass

def validation(source: str):
    """Check the source for an MPIA affiliation before parsing the TeX code.

    Raises AffiliationError if the affiliation check does not pass.
    """
    check = mpia.affiliation_verifications(source, verbose=True)
    if check is not True:
        # `check` carries the failure message when it is not True
        raise AffiliationError(f"mpia.affiliation_verifications: {check}")

        
warnings.simplefilter('always', AffiliationWarning)
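
The validation hook above follows a simple pattern: a check returns `True` or a failure message, and the hook turns anything else into an exception. A minimal, self-contained sketch of that pattern follows. `check_affiliation` and `try_validate` are hypothetical stand-ins for illustration and are not part of `arxiv_on_deck_2`; the keyword test is an assumption.

```python
import warnings


class AffiliationWarning(UserWarning):
    """Non-fatal: something looks off, but processing continues."""


class AffiliationError(RuntimeError):
    """Fatal: the source fails the affiliation check."""


def check_affiliation(source: str, keyword: str = "Heidelberg"):
    """Hypothetical stand-in for the affiliation check.

    Returns True on success, or a message string describing the failure.
    """
    if keyword.lower() in source.lower():
        return True
    return f"'{keyword}' keyword not found."


def validation(source: str) -> None:
    """Raise AffiliationError when the check does not return True."""
    check = check_affiliation(source)
    if check is not True:
        raise AffiliationError(f"check_affiliation: {check}")


def try_validate(source: str) -> bool:
    """Debug-friendly wrapper: downgrade the error to a warning."""
    try:
        validation(source)
    except AffiliationError as err:
        warnings.warn(AffiliationWarning(str(err)))
        return False
    return True
```

When debugging, the wrapper lets the rest of a cell run while still surfacing the reason the check failed.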

We get the author list from the MPIA website

In [2]:
# Getting the list of authors can take some time (it depends on the internet connection).
# Cache the MPIA author list to avoid querying the website every time we restart the kernel.
import yaml
try:
    with open('tmp_mpia_authors.yml', 'r') as fin:
        mpia_authors = yaml.load(fin, yaml.BaseLoader)
    print("`mpia.get_mpia_mitarbeiter_list()`: restored from cache")
except FileNotFoundError:
    print("`mpia.get_mpia_mitarbeiter_list()`: cannot be restored from cache.")
    # get list from MPIA website
    # it automatically filters identified non-scientists :func:`mpia.filter_non_scientists`
    mpia_authors = mpia.get_mpia_mitarbeiter_list()
    with open('tmp_mpia_authors.yml', 'w') as fout:
        fout.write(yaml.dump(mpia_authors))

`mpia.get_mpia_mitarbeiter_list()`: restored from cache
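
The cell above is an instance of a cache-or-fetch pattern: try to read a local file, and only fall back to the slow query if the file is missing. Below is a small generic sketch of the same idea. It uses the standard-library `json` module instead of YAML to stay dependency-free (an assumption, not the notebook's choice), and `load_or_build` and `fake_fetch` are hypothetical names.

```python
import json
import os
import tempfile


def load_or_build(cache_path, builder):
    """Return (data, restored) where restored tells if the cache was used."""
    try:
        with open(cache_path, "r", encoding="utf-8") as fin:
            return json.load(fin), True
    except FileNotFoundError:
        # Cache miss: run the slow step once and store its result.
        data = builder()
        with open(cache_path, "w", encoding="utf-8") as fout:
            json.dump(data, fout)
        return data, False


def fake_fetch():
    """Hypothetical stand-in for the slow website query."""
    return [["Jane", "Doe"], ["John", "Smith"]]


cache_file = os.path.join(tempfile.mkdtemp(), "tmp_authors.json")
first, from_cache_1 = load_or_build(cache_file, fake_fetch)
second, from_cache_2 = load_or_build(cache_file, fake_fetch)
```

Deleting the cache file forces a fresh query on the next run, which is useful when the staff list has changed.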


In [3]:
# Get family names only
data = [k[-1] for k in mpia_authors]

# clean -- Having titles is something new at MPIA
data = [k.replace('Dr. ', '').strip() for k in data]

filtered_data = list(filter(mpia.filter_non_scientists, data))

name_variations = filter(lambda x: x is not None,
                         [mpia.consider_variations(name) for name in filtered_data])
mitarbeiter_list = sorted(filtered_data + list(name_variations))

lst = [(mpia.get_special_corrections(mpia.get_initials(name)), name) for name in mitarbeiter_list]
lst = [(mpia.family_name_from_initials(k[0]), k[0], k[1]) for k in lst]
# Deduplicate first, then sort: converting to a set after sorting would discard the order.
lst = sorted(set(lst), key=lambda x: x[0])
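
To make the shape of the resulting tuples concrete, here is a simplified sketch of the same transformation on a toy list. The helpers `initials_form` and `family_name` are hypothetical and only approximate what the `mpia` functions might do; the naming convention ("R. L. Davies") is an assumption.

```python
def initials_form(full_name: str) -> str:
    """Abbreviate given names to initials and keep the family name."""
    *given, family = full_name.split()
    return " ".join([g[0] + "." for g in given] + [family])


def family_name(short_name: str) -> str:
    """Last whitespace-separated token of an initials-form name."""
    return short_name.split()[-1]


names = ["Rebecca L. Davies", "Sarah Bosman", "Rebecca L. Davies"]
rows = [(initials_form(n), n) for n in names]
rows = [(family_name(short), short, full) for short, full in rows]
# Deduplicate first, then sort, so the final order is deterministic.
rows = sorted(set(rows), key=lambda row: row[0])
```

Each row is `(family name, initials form, full name)`, so the first element can be used directly as the list of names to highlight.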

We get the paper to debug

In [4]:
which = "2303.02816"
paper = get_paper_from_identifier(which)
paper


|||
|---:|:---|
| [![arXiv](https://img.shields.io/badge/arXiv-2303.02816-b31b1b.svg)](https://arxiv.org/abs/2303.02816) | **Examining the Decline in the C IV Content of the Universe over 4.3 < z < 6.3 using the E-XQR-30 Sample**  |
|| Rebecca L. Davies, et al. |
|*Appeared on*| *2023-03-06*|
|*Comments*| *20 pages, 8 figures, 4 tables. Published in MNRAS*|
|**Abstract**| Intervening CIV absorbers are key tracers of metal-enriched gas in galaxy halos over cosmic time. Previous studies suggest that the CIV cosmic mass density ($\Omega_{\rm CIV}$) decreases slowly over 1.5 $\lesssim z\lesssim$ 5 before declining rapidly at $z\gtrsim$ 5, but the cause of this downturn is poorly understood. We characterize the $\Omega_{\rm CIV}$ evolution over 4.3 $\lesssim z\lesssim$ 6.3 using 260 absorbers found in 42 XSHOOTER spectra of $z\sim$ 6 quasars, of which 30 come from the ESO Large Program XQR-30. The large sample enables us to robustly constrain the rate and timing of the downturn. We find that $\Omega_{\rm CIV}$ decreases by a factor of 4.8 $\pm$ 2.0 over the ~300 Myr interval between $z\sim$ 4.7 and $z\sim$ 5.8. The slope of the column density (log N) distribution function does not change, suggesting that CIV absorption is suppressed approximately uniformly across 13.2 $\leq$ log N/cm$^{-2}$ < 15.0. Assuming that the carbon content of galaxy halos evolves as the integral of the cosmic star formation rate density (with some delay due to stellar lifetimes and outflow travel times), we show that chemical evolution alone could plausibly explain the fast decline in $\Omega_{\rm CIV}$ over 4.3 $\lesssim z\lesssim$ 6.3. However, the CIV/CII ratio decreases at the highest redshifts, so the accelerated decline in $\Omega_{\rm CIV}$ at $z\gtrsim$ 5 may be more naturally explained by rapid changes in the gas ionization state driven by evolution of the UV background towards the end of hydrogen reionization.|

In [5]:
# select only papers with matching author names and highlight authors
hl_list = [k[0] for k in lst]

hl_authors = highlight_authors_in_list(paper['authors'], hl_list, verbose=True)
matches = [(hl, orig) for hl, orig in zip(hl_authors, paper['authors']) if 'mark' in hl]
if not matches:
    warnings.warn(AffiliationWarning("WARNING: This paper does not seem to have MPIA authors."))
paper['authors'] = hl_authors
paper


|||
|---:|:---|
| [![arXiv](https://img.shields.io/badge/arXiv-2303.02816-b31b1b.svg)](https://arxiv.org/abs/2303.02816) | **Examining the Decline in the C IV Content of the Universe over 4.3 < z < 6.3 using the E-XQR-30 Sample**  |
|| <mark>Rebecca L. Davies</mark>, et al. -- incl., <mark>Sarah E. I. Bosman</mark>, <mark>Romain A. Meyer</mark>, <mark>Frederick B. Davies</mark> |
|*Appeared on*| *2023-03-06*|
|*Comments*| *20 pages, 8 figures, 4 tables. Published in MNRAS*|
|**Abstract**| Intervening CIV absorbers are key tracers of metal-enriched gas in galaxy halos over cosmic time. Previous studies suggest that the CIV cosmic mass density ($\Omega_{\rm CIV}$) decreases slowly over 1.5 $\lesssim z\lesssim$ 5 before declining rapidly at $z\gtrsim$ 5, but the cause of this downturn is poorly understood. We characterize the $\Omega_{\rm CIV}$ evolution over 4.3 $\lesssim z\lesssim$ 6.3 using 260 absorbers found in 42 XSHOOTER spectra of $z\sim$ 6 quasars, of which 30 come from the ESO Large Program XQR-30. The large sample enables us to robustly constrain the rate and timing of the downturn. We find that $\Omega_{\rm CIV}$ decreases by a factor of 4.8 $\pm$ 2.0 over the ~300 Myr interval between $z\sim$ 4.7 and $z\sim$ 5.8. The slope of the column density (log N) distribution function does not change, suggesting that CIV absorption is suppressed approximately uniformly across 13.2 $\leq$ log N/cm$^{-2}$ < 15.0. Assuming that the carbon content of galaxy halos evolves as the integral of the cosmic star formation rate density (with some delay due to stellar lifetimes and outflow travel times), we show that chemical evolution alone could plausibly explain the fast decline in $\Omega_{\rm CIV}$ over 4.3 $\lesssim z\lesssim$ 6.3. However, the CIV/CII ratio decreases at the highest redshifts, so the accelerated decline in $\Omega_{\rm CIV}$ at $z\gtrsim$ 5 may be more naturally explained by rapid changes in the gas ionization state driven by evolution of the UV background towards the end of hydrogen reionization.|
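
The highlighting step can be pictured as wrapping every author whose family name is in the reference list with a `<mark>` tag. The sketch below is a simplified illustration, not the implementation of `highlight_authors_in_list`: it matches on the last token of the name only (an assumption), uses "A. N. Other" as a placeholder name, and skips names that are already marked so that a second pass does not nest tags.

```python
def highlight_authors(authors, family_names):
    """Wrap authors whose family name is in family_names with <mark>."""
    wanted = {name.lower() for name in family_names}
    highlighted = []
    for author in authors:
        if "<mark>" in author:
            # Already highlighted by an earlier pass: leave untouched.
            highlighted.append(author)
            continue
        family = author.split()[-1].lower()
        if family in wanted:
            author = f"<mark>{author}</mark>"
        highlighted.append(author)
    return highlighted


authors = ["Rebecca L. Davies", "A. N. Other", "Sarah E. I. Bosman"]
once = highlight_authors(authors, ["Davies", "Bosman"])
twice = highlight_authors(once, ["Davies", "Bosman"])
# Keep (highlighted, original) pairs for the authors that matched.
matches = [(hl, orig) for hl, orig in zip(once, authors) if "<mark>" in hl]
```

Matching on the family name alone can produce false positives for common names, which is one reason the affiliation check on the TeX source is still needed.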

We get the (TeX) source:
* retrieve the tarball
* find the main TeX file and parse it
* parse for affiliations (since we are debugging, we do not stop if none are found)
* generate the output markdown
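
As an illustration of the second step in the list above, one simple heuristic for locating the main TeX file is to look for the file that contains both `\documentclass` and `\begin{document}`. This is a sketch under that assumption; `find_main_tex` is a hypothetical helper and may differ from what `latex.LatexDocument` does internally.

```python
import os
import tempfile


def find_main_tex(folder):
    """Return the first .tex file (in sorted order) that looks like a main file."""
    for root, _dirs, files in sorted(os.walk(folder)):
        for name in sorted(files):
            if not name.endswith(".tex"):
                continue
            path = os.path.join(root, name)
            with open(path, "r", encoding="utf-8", errors="ignore") as fin:
                text = fin.read()
            # Assumed heuristic: a main file declares a class and a document body.
            if r"\documentclass" in text and r"\begin{document}" in text:
                return path
    return None


# Tiny demonstration on a temporary folder with two TeX files.
folder = tempfile.mkdtemp()
with open(os.path.join(folder, "macros.tex"), "w", encoding="utf-8") as fout:
    fout.write("\\newcommand{\\kms}{km s^{-1}}\n")
with open(os.path.join(folder, "paper.tex"), "w", encoding="utf-8") as fout:
    fout.write("\\documentclass{article}\n\\begin{document}\nHello\n\\end{document}\n")
main_tex = find_main_tex(folder)
```

Submissions with several candidate files would need a tie-breaking rule, which this sketch does not attempt.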

In [6]:
paper_id = which
folder = f'tmp_{paper_id}'

# Download and unpack the source only if it is not already on disk
if not os.path.isdir(folder):
    folder = retrieve_document_source(paper_id, folder)

try:
    doc = latex.LatexDocument(folder, validation=validation)    
except AffiliationError as affilerror:
    msg = f"ArXiv:{paper_id:s} is not an MPIA paper... " + str(affilerror)
    print(msg)

# Hack because sometimes author parsing does not work well
if (len(doc.authors) != len(paper['authors'])):
    doc._authors = paper['authors']
if (doc.abstract) in (None, ''):
    doc._abstract = paper['abstract']

doc.comment = get_markdown_badge(paper_id) + " _" + paper['comments'] + "_"
doc.highlight_authors_in_list(hl_list, verbose=True)

full_md = doc.generate_markdown_text()



✔ → 0:header
  ↳ 5777:\section{Introduction}
✔ → 5777:\section{Introduction}
  ↳ 14249:\section{Sample and Data Processing}\label{sec:sample}
✔ → 14249:\section{Sample and Data Processing}\label{sec:sample}
  ↳ 35264:\section{C~IV Line Statistics}\label{sec:results}
✘ → 35264:\section{C~IV Line Statistics}\label{sec:results}
  ↳ 69546:\section{Discussion}\label{sec:discussion}
✔ → 69546:\section{Discussion}\label{sec:discussion}
  ↳ 106506:\section{Summary}\label{sec:conclusions}
✔ → 106506:\section{Summary}\label{sec:conclusions}
  ↳ 115467:end
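
The log above pairs each chunk's starting character offset with the offset where the next chunk begins. A minimal sketch of how such offsets could be computed is shown below. It assumes chunks are delimited by `\section` commands only, and `section_offsets` is a hypothetical helper rather than the parser used by `arxiv_on_deck_2`.

```python
import re

# Matches \section{ and \section*{ at any position in the source.
SECTION_RE = re.compile(r"\\section\*?\{")


def section_offsets(tex: str):
    """Return [(offset, label), ...] covering the header, sections, and end."""
    marks = [(0, "header")]
    for match in SECTION_RE.finditer(tex):
        line_end = tex.find("\n", match.start())
        if line_end == -1:
            line_end = len(tex)
        marks.append((match.start(), tex[match.start():line_end]))
    marks.append((len(tex), "end"))
    return marks


tex = (
    "\\documentclass{article}\n"
    "\\section{Introduction}\n"
    "Some text.\n"
    "\\section{Summary}\\label{sec:conclusions}\n"
    "Done.\n"
)
marks = section_offsets(tex)
# Pair each mark with the next one to get (start, stop, label) chunks.
chunks = [(start, stop, label)
          for (start, label), (stop, _next) in zip(marks, marks[1:])]
```

Slicing the source with a chunk's `(start, stop)` gives exactly the text that failed to parse, which is a convenient starting point when one section is flagged.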


In [7]:
Markdown(full_md)

<div class="macros" style="visibility:hidden;">
$\newcommand{\ensuremath}{}$
$\newcommand{\xspace}{}$
$\newcommand{\object}[1]{\texttt{#1}}$
$\newcommand{\farcs}{{.}''}$
$\newcommand{\farcm}{{.}'}$
$\newcommand{\arcsec}{''}$
$\newcommand{\arcmin}{'}$
$\newcommand{\ion}[2]{#1#2}$
$\newcommand{\textsc}[1]{\textrm{#1}}$
$\newcommand{\hl}[1]{\textrm{#1}}$
$\newcommand{\OmCIV}{\mbox{\Omega_{\rm C   \textsc{iv}}}}$
$\newcommand{\OmCII}{\mbox{\Omega_{\rm C   \textsc{ii}}}}$
$\newcommand{\HII}{\mbox{H \textsc{ii}}}$
$\newcommand{\CI}{\mbox{C \textsc{i}}}$
$\newcommand{\CII}{\mbox{C \textsc{ii}}}$
$\newcommand{\CIII}{\mbox{C \textsc{iii}}}$
$\newcommand{\CIV}{\mbox{C \textsc{iv}}}$
$\newcommand{\CV}{\mbox{C \textsc{v}}}$
$\newcommand{\SiII}{\mbox{Si \textsc{ii}}}$
$\newcommand{\SiIV}{\mbox{Si \textsc{iv}}}$
$\newcommand{\NV}{\mbox{N \textsc{v}}}$
$\newcommand{\HeII}{\mbox{He \textsc{ii}}}$
$\newcommand{\FeII}{\mbox{Fe \textsc{ii}}}$
$\newcommand{\MgII}{\mbox{Mg \textsc{ii}}}$
$\newcommand{\OI}{\mbox{O \textsc{i}}}$
$\newcommand{\Lya}{Ly\alpha}$
$\newcommand{\kms}{ km s^{-1}}$
$\newcommand{\dndx}{dn/dX}$
$\newcommand{\bibtex}{\textsc{Bib}\!\TeX}$
$\newcommand{\appropto}{\mathrel{\vcenter{$
$  \offinterlineskip\halign{\hfil##\cr$
$    \propto\cr\noalign{\kern2pt}\sim\cr\noalign{\kern-2pt}}}}}$</div>



<div id="title">

# Examining the Decline in the C IV Content of the Universe over $\mbox{4.3 $\lesssim z \lesssim$ 6.3}$ using the E-XQR-30 Sample

</div>
<div id="comments">

[![arXiv](https://img.shields.io/badge/arXiv-2303.02816-b31b1b.svg)](https://arxiv.org/abs/2303.02816) _20 pages, 8 figures, 4 tables. Published in MNRAS_

</div>
<div id="authors">

<mark>Rebecca L. Davies</mark>, et al. -- incl., <mark>Sarah E. I. Bosman</mark>, <mark>Romain A. Meyer</mark>, <mark>Frederick B. Davies</mark>

</div>
<div id="abstract">

**Abstract:** Intervening $\CIV$ absorbers are key tracers of metal-enriched gas in galaxy halos over cosmic time. Previous studies suggest that the $\CIV$ cosmic mass density ( $\OmCIV$ ) decreases slowly over $\mbox{1.5 $\lesssim z\lesssim$ 5}$ before declining rapidly at $\mbox{$z\gtrsim$ 5}$ , but the cause of this downturn is poorly understood. We characterize the $\OmCIV$ evolution over $\mbox{4.3 $\lesssim z\lesssim$ 6.3}$ using 260 absorbers found in 42 XSHOOTER spectra of $z\sim$ 6 quasars, of which 30 come from the ESO Large Program XQR-30. The large sample enables us to robustly constrain the rate and timing of the downturn. We find that $\OmCIV$ decreases by a factor of 4.8 $\pm$ 2.0 over the $\mbox{$\sim$ 300 Myr}$ interval between $z\sim$ 4.7 and $z\sim$ 5.8. The slope of the column density ( $\log N$ ) distribution function does not change, suggesting that $\CIV$ absorption is suppressed approximately uniformly across $\mbox{13.2 $\leq\log N$/cm$^{-2}$ $<$ 15.0}$ . Assuming that the carbon content of galaxy halos evolves as the integral of the cosmic star formation rate density (with some delay due to stellar lifetimes and outflow travel times), we show that chemical evolution alone could plausibly explain the fast decline in $\OmCIV$ over $\mbox{4.3 $\lesssim z\lesssim$ 6.3}$ . However, the $\CIV$ / $\CII$ ratio decreases at the highest redshifts, so the accelerated decline in $\OmCIV$ at $z\gtrsim$ 5 may be more naturally explained by rapid changes in the gas ionization state driven by evolution of the UV background towards the end of hydrogen reionization.

</div>

<div id="div_fig1">

<img src="tmp_2303.02816/./CIV_sightlines.png" alt="Fig4.1" width="50%"/><img src="tmp_2303.02816/./completeness_CIV.png" alt="Fig4.2" width="50%"/>

**Figure 4. -** Left: Illustration of the $\CIV$ absorber sample. Horizontal lines show the redshift intervals over which the search for $\CIV$ absorbers was conducted for each of the 42 quasar sightlines. Grey dashed regions highlight proximity zones within 10,000 $\kms$ of the quasar redshift. Longer gaps trace redshift intervals where the $\CIV$ lines fall within BAL features or regions that were masked due to strong skyline or telluric contamination. The markers show all $\CIV$ absorbers in the catalog, where solid circles indicate primary absorbers (those that were automatically identified, pass the visual inspection check, and do not fall in masked wavelength regions or BAL regions) and open squares indicate secondary absorbers (all others). Proximate absorbers are shown in black. The intervening absorbers are split into two redshift bins (indicated by the marker color), divided at the path-length-weighted mean redshift of our survey. Right: Completeness as a function of column density for $\CIV$ absorbers in the two redshift intervals considered in \citetalias{Davies22Survey}. The dashed lines show the best-fit arctan functions published in that work. (*fig:completeness*)

</div>
<div id="div_fig2">

<img src="tmp_2303.02816/./C_ion_fractions_13.2_15.0.png" alt="Fig2" width="100%"/>

**Figure 2. -** Bar graph illustrating the contribution of $\CIV$ and $\CII$ to the cosmic mass density of carbon in each redshift bin. Vertical lines illustrate the measurement errors and are centered on the path-length-weighted mean redshift of each bin with small offsets added for clarity. We are unable to measure $\OmCII$ in the two lowest redshift bins due to the saturation of the $\Lya$ forest. The evolving balance between $\OmCII$ and $\OmCIV$ across the two highest redshift bins suggests that changes in the UV background driven by hydrogen reionization may contribute to the decline in $\OmCIV$. (*fig:carbon_ion_fractions*)

</div>
<div id="div_fig3">

<img src="tmp_2303.02816/./c_ion_modelling.png" alt="Fig3" width="100%"/>

**Figure 3. -** Illustration of the possible ranges of gas densities probed by different ionization states of carbon at $z$ = 5 and $z$ = 6. The curves are the outputs of cloudy photoionization models. We fix the gas temperature to 10$^4$ K and adopt the 2011 update of the \citet{FaucherGiguere09} ionizing spectrum with a density-dependent correction for gas self-shielding as described in \citet{Keating16}. The ionizing spectrum is re-scaled at each redshift to match the H i photoionization rates measured by \citet{Calverley11}. (*fig:cloudy_modelling*)

</div>
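
When custom macros such as `\CI`, `\CII` and `\CIV` are wrapped in math delimiters for rendering, the wrapping has to match whole macro names, otherwise a short macro can match the beginning of a longer one. The sketch below contrasts a naive string replacement with a regex that tries the longest names first and refuses matches followed by a letter. Both functions are hypothetical illustrations and are not taken from `arxiv_on_deck_2`.

```python
import re


def wrap_macros_naive(text, macros):
    """Plain string replacement: a short macro can match inside a longer one."""
    for macro in macros:
        text = text.replace(macro, f"${macro}$")
    return text


def wrap_macros_safe(text, macros):
    """Match whole macro names only: longest first, not followed by a letter."""
    ordered = sorted(macros, key=len, reverse=True)
    pattern = "|".join(re.escape(m) for m in ordered)
    regex = re.compile(rf"(?:{pattern})(?![A-Za-z])")
    # A function replacement avoids backslash-escape processing in the output.
    return regex.sub(lambda m: f"${m.group(0)}$", text)


macros = [r"\CI", r"\CII", r"\CIV"]
caption = r"Illustration of the \CIV absorber sample."
naive = wrap_macros_naive(caption, macros)
safe = wrap_macros_safe(caption, macros)
```

Here `naive` splits the macro into `$\CI$V`, while `safe` keeps `$\CIV$` intact; the negative lookahead mirrors how TeX itself reads a control word up to the first non-letter.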