# Basic usage of the `chaotic_neural` package

This example notebook shows how to use the basic `astro-ph-GA-23May2021` model, trained on the ~30k most recent arXiv astro-ph.GA papers as of May 23, 2021.

Users can query the model for similar papers by passing a `doc_id` together with:
- `input_type`: either `arxiv_id` or `keywords`, which determines how `doc_id` is interpreted. For `keywords`, pass a list of keyword strings as `doc_id`.
- `return_n`: the number of results to return.

Additional arguments include:
- `show_authors` (default = `False`): set to `True` to show the author list.
- `show_summary` (default = `False`): set to `True` to show a 1-2 sentence summary of the abstract, generated using the `summa` package.

In [1]:
import chaotic_neural as cn

If you are running this notebook on your machine from a cloned copy of the repository and are using the pre-trained model that ships with `chaotic_neural`, please make sure to either
1. run this notebook from the same `/docs/tutorial` directory you found it in, or
2. change `cn_dir` in the cell below to the directory where `chaotic_neural` is installed.
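
If you are unsure which of the two cases applies, the directory check can be sketched with the standard library. `find_cn_dir` is a hypothetical helper, not part of `chaotic_neural`, and it assumes that `cn_dir` is expected as a string ending in a path separator, as in the cell below.

```python
import os
from pathlib import Path


def find_cn_dir(candidates):
    """Return the first existing directory among `candidates`, with a trailing separator.

    Hypothetical helper for illustration; not part of chaotic_neural.
    """
    for candidate in candidates:
        path = Path(candidate).expanduser()
        if path.is_dir():
            # os.path.join(p, "") appends a trailing separator, matching the
            # '../../chaotic_neural/' style used in this notebook.
            return os.path.join(str(path.resolve()), "")
    raise FileNotFoundError(f"None of these directories exist: {list(candidates)}")


# Example (paths are placeholders; adjust them to your own checkout):
# cn_dir = find_cn_dir(["../../chaotic_neural/", "~/code/chaotic_neural/chaotic_neural/"])
```

Failing early with a `FileNotFoundError` makes a wrong path easier to spot than an error raised later while the model files are being read.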

In [2]:
model_data = cn.load_trained_doc2vec_model('astro-ph-GA-23May2021', cn_dir = '../../chaotic_neural/')
model, all_titles, all_abstracts, all_authors, train_corpus, test_corpus = model_data
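
The loader returns a plain six-element tuple, so the position of each element matters. As a small sketch, the fields can be given names with a `namedtuple`. `ModelData` is a hypothetical wrapper that is not part of `chaotic_neural`; its field order simply copies the unpacking shown above, and the placeholder values stand in for the real model output.

```python
from collections import namedtuple

# Hypothetical wrapper (not part of chaotic_neural); the field order follows
# the tuple unpacking used in this notebook.
ModelData = namedtuple(
    "ModelData",
    ["model", "all_titles", "all_abstracts", "all_authors", "train_corpus", "test_corpus"],
)

# Placeholder values standing in for cn.load_trained_doc2vec_model(...) output.
placeholder = (None, ["Title A", "Title B"], ["Abstract A", "Abstract B"], [[], []], [], [])
named = ModelData(*placeholder)

# Fields are now accessible by name as well as by position.
print(len(named.all_titles))

# A namedtuple is still a tuple, so positional unpacking keeps working.
model, all_titles, all_abstracts, all_authors, train_corpus, test_corpus = named
```

Because a `namedtuple` is still a tuple, code that unpacks it positionally should behave the same way, although this notebook does not confirm how the package's functions handle it.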

In [3]:
sims = cn.list_similar_papers(model_data, doc_id = 1903.10457, input_type='arxiv_id')

ArXiv id:  1903.10457
Title: Learning the Relationship between Galaxies Spectra and their Star
  Formation Histories using Convolutional Neural Networks and Cosmological
  Simulations
-----------------------------
Most similar/relevant papers: 
-----------------------------
0 Learning the Relationship between Galaxies Spectra and their Star
  Formation Histories using Convolutional Neural Networks and Cosmological
  Simulations  (Corrcoef: 0.99 )
1 MCSED: A flexible spectral energy distribution fitting code and its
  application to $z \sim 2$ emission-line galaxies  (Corrcoef: 0.73 )
2 Augmenting machine learning photometric redshifts with Gaussian mixture
  models  (Corrcoef: 0.73 )
3 MAGPHYS+photo-z: Constraining the Physical Properties of Galaxies with
  Unknown Redshifts  (Corrcoef: 0.72 )
4 MOSFIRE Spectroscopy of Quiescent Galaxies at 1.5 < z < 2.5. II - Star
  Formation Histories and Galaxy Quenching  (Corrcoef: 0.71 )
5 Stellar Populations of over one thousand $z\sim0.8$ Galaxi
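
A note on the query above: it passes the arXiv identifier as a bare number, which Python parses as a float. That works for `1903.10457`, but an identifier ending in zero would silently lose its last digit. The sketch below illustrates the pitfall and adds a simple format check. `looks_like_arxiv_id` is a hypothetical helper, `2010.10450` is an illustrative identifier chosen only because it ends in zero, and whether `list_similar_papers` accepts a string for `doc_id` is not confirmed by this notebook.

```python
import re


def looks_like_arxiv_id(arxiv_id: str) -> bool:
    """Check a new-style arXiv identifier (YYMM.NNNN or YYMM.NNNNN).

    Hypothetical helper for illustration; not part of chaotic_neural.
    """
    match = re.fullmatch(r"(\d{2})(\d{2})\.(\d{4,5})", arxiv_id)
    if match is None:
        return False
    yy, mm, number = match.groups()
    if not 1 <= int(mm) <= 12:
        return False
    # Identifiers from 1501 onward use five digits after the dot;
    # those from 0704 through 1412 use four.
    expected_digits = 5 if int(yy) >= 15 else 4
    return len(number) == expected_digits


print(str(1903.10457))  # '1903.10457': survives the round trip
print(str(2010.10450))  # '2010.1045': the trailing zero is lost

print(looks_like_arxiv_id("2010.10450"))     # True
print(looks_like_arxiv_id(str(2010.10450)))  # False
```

Keeping identifiers as strings wherever possible avoids this class of error.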

In [4]:
sims = cn.list_similar_papers(model_data, doc_id = ['quenching','galaxy'], 
                           input_type='keywords', 
                           return_n=10, show_authors = True, show_summary=True)

Keyword(s):  ['quenching', 'galaxy']
multi-keyword
-----------------------------
Most similar/relevant papers: 
-----------------------------
0 The cumulative star-formation histories of dwarf galaxies with TNG50. I:
  Environment-driven diversity and connection to quenching  (Corrcoef: 0.58 )
Authors:------
[{'name': 'Gandhali D. Joshi'}, {'name': 'Annalisa Pillepich'}, {'name': 'Dylan Nelson'}, {'name': 'Elad Zinger'}, {'name': 'Federico Marinacci'}, {'name': 'Volker Springel'}, {'name': 'Mark Vogelsberger'}, {'name': 'Lars Hernquist'}]
Summary:------
The key factors determining the dwarfs' SFHs are their status as central or satellite and their stellar mass, with centrals and more massive dwarfs assembling their stellar mass at later times on average compared to satellites and lower mass dwarfs.
TNG50 predicts a large diversity in SFHs for both centrals and satellites, so that the stacked cumulative SFHs are representative of the TNG50 dwarf populations only in an average sense and 