# Basic usage of the `chaotic_neural` package

This example notebook shows how to use the basic `galaxies_all` model, which was trained on almost all arXiv astro-ph.GA papers up to October 24, 2019.

Users can query the model for similar papers with the following arguments:
- `doc_id`: the query itself, either an arXiv ID or a list of keyword strings.
- `input_type`: `arxiv_id` or `keywords`, matching what is passed in the `doc_id` field.
- `return_n`: controls how many results to return.

Additional arguments include:
- `show_authors` (default = False): set to True to show author list
- `show_summary` (default = False): set to True to show a 1-2 sentence abstract summary generated using the `summa` package.

In [1]:
import chaotic_neural as cn

If you are running this notebook on your machine from a cloned version of the repository and are using the pre-trained model that comes with `chaotic_neural`, please make sure to either
1. run this notebook in the same /docs/tutorial directory you find it in, or
2. change the directory `cn_dir` in the cell below to match the directory that `chaotic_neural` is installed in.
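
Because `cn_dir` is a relative path, it is resolved against the directory the notebook is run from. A minimal sketch for checking where it points before loading the model is given here; `resolve_cn_dir` is a hypothetical helper written for this illustration and is not part of `chaotic_neural`.

```python
from pathlib import Path

def resolve_cn_dir(cn_dir="../../chaotic_neural/"):
    """Return the absolute form of cn_dir and whether it is an existing directory.

    Hypothetical helper for illustration; not provided by chaotic_neural.
    """
    path = Path(cn_dir).expanduser().resolve()
    return path, path.is_dir()

# Relative paths are interpreted from the current working directory.
path, found = resolve_cn_dir()
print(f"cn_dir resolves to {path} (exists: {found})")
```

If the directory is reported as missing, either start the notebook from /docs/tutorial or pass the correct location as `cn_dir`.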

In [2]:
model_data = cn.load_trained_doc2vec_model('galaxies_all', cn_dir = '../../chaotic_neural/')
model, all_titles, all_abstracts, all_authors, train_corpus, test_corpus = model_data
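
The query in the next cell passes the arXiv ID as a bare numeric literal. That works for `1903.10457`, but a Python float cannot keep a trailing zero, so it is worth knowing what happens to IDs that end in one. The sketch below shows only the plain Python behaviour; the ID `1903.10450` is a made-up value chosen for illustration, `format_arxiv_id` is a hypothetical helper, and the formatting assumes the five-digit sequence numbers used for IDs from 2015 onward. How `chaotic_neural` itself treats such IDs is not shown in this notebook.

```python
# arXiv identifiers are labels rather than quantities, so trailing zeros matter.
id_as_float = 1903.10450     # illustrative ID ending in zero
print(id_as_float)           # prints 1903.1045

def format_arxiv_id(value):
    """Render a numeric ID as YYMM.NNNNN (hypothetical helper, assumes 5 digits)."""
    return f"{value:010.5f}"

print(format_arxiv_id(id_as_float))   # prints 1903.10450
```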

In [3]:
sims = cn.list_similar_papers(model_data, doc_id = 1903.10457, input_type='arxiv_id')

ArXiv id:  1903.10457
Title: Learning the Relationship between Galaxies Spectra and their Star
  Formation Histories using Convolutional Neural Networks and Cosmological
  Simulations
-----------------------------
Most similar/relevant papers: 
-----------------------------
0 Learning the Relationship between Galaxies Spectra and their Star
  Formation Histories using Convolutional Neural Networks and Cosmological
  Simulations  (Corrcoef: 0.97 )
1 A Sparse Gaussian Process Framework for Photometric Redshift Estimation  (Corrcoef: 0.71 )
2 Improving Photometric Redshift Estimation using GPz: size information,
  post processing and improved photometry  (Corrcoef: 0.68 )
3 The Spatial Distribution of Satellite Galaxies Within Halos: Measuring
  the Very Small Scale Angular Clustering of SDSS Galaxies  (Corrcoef: 0.68 )
4 Photometric redshifts for the SDSS Data Release 12  (Corrcoef: 0.67 )
5 The Data Analysis Pipeline for the SDSS-IV MaNGA IFU Galaxy Survey:
  Emission-Line Modeling  (Co
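
Each result above carries a score labelled `Corrcoef`, and the query paper itself comes first with a score close to 1. The notebook does not show how that score is computed. As a minimal sketch, assuming it is a Pearson correlation coefficient between document vectors, the ranking can be illustrated with toy vectors in plain Python. The vectors, paper labels, and function names below are made up for this example and are not taken from the package.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def rank_by_corrcoef(query_vec, doc_vecs, return_n=10):
    """Return the return_n (label, score) pairs with the highest correlation."""
    scores = [(label, pearson(query_vec, vec)) for label, vec in doc_vecs.items()]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:return_n]

# Toy document vectors (illustrative values only).
toy_vectors = {
    "paper A": [1.0, 2.0, 3.0, 4.0],
    "paper B": [1.0, 2.0, 2.5, 4.5],
    "paper C": [4.0, 3.0, 2.0, 1.0],
}

top = rank_by_corrcoef(toy_vectors["paper A"], toy_vectors, return_n=2)
for rank, (label, score) in enumerate(top):
    print(rank, label, f"(Corrcoef: {score:.2f})")
```

In this toy setup the query document ranks first, followed by the vector that rises and falls most like it, which mirrors the ordering in the output above.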

In [4]:
sims = cn.list_similar_papers(model_data, doc_id=['simulation', 'sed', 'fitting'],
                              input_type='keywords',
                              return_n=3, show_authors=True, show_summary=True)

Keyword(s):  ['simulation', 'sed', 'fitting']
multi-keyword
-----------------------------
Most similar/relevant papers: 
-----------------------------
0 Should we believe the results of UV-mm galaxy SED modelling?  (Corrcoef: 0.53 )
Authors:------
[{'name': 'Christopher C. Hayward'}, {'name': 'Daniel J. B. Smith'}]
Summary:------
We compare the properties inferred from the SED modelling with the true values and find that MAGPHYS recovers most physical parameters of the simulated galaxies well.
 
1 Morphology-assisted galaxy mass-to-light predictions using deep learning  (Corrcoef: 0.47 )
Authors:------
[{'name': 'Wouter Dobbels'}, {'name': 'Serge Krier'}, {'name': 'Stephan Pirson'}, {'name': 'Sébastien Viaene'}, {'name': 'Gert De Geyter'}, {'name': 'Samir Salim'}, {'name': 'Maarten Baes'}]
Summary:------
Spectral energy distribution (SED) fitting can make use of all available fluxes and their errors to make a Bayesian estimate of the M/L.
When we combine the morphology features with gl