# The Virgo Database

We are going to extract data from the [Virgo Database](http://virgo.dur.ac.uk/data.php), in particular from the [millimil](http://galaxy-catalogue.dur.ac.uk:8080/Millennium/) database, for which registration is not necessary. The millimil simulation was run in a cubic box with a comoving side of $62.5\,{\rm Mpc}/h$, with dark matter particles with a minimum mass of $0.86\times 10^9\,{\rm M}_{\odot}/h$. The cosmological parameters assumed in this simulation are $\Omega_0 = 0.25$, $\Lambda_0 = 0.75$, $\Omega_b = 0.045$ and $h_0 = 0.73$, the same as for the [Millennium Simulation](https://arxiv.org/abs/astro-ph/0504097).


### Q.1 What kind of data can you find in the [Virgo Database](http://virgo.dur.ac.uk/data.php)?

The information in databases is stored in tables. The data from a given table can be retrieved using SQL. A good tutorial on SQL can be found [here](https://www.codeschool.com/courses/try-sql). The Virgo Database has an SQL query form, where queries can be typed directly:
<img src="images/queryform.png">

Some demo queries are provided in the Database and [McAlpine et al. 2016](https://arxiv.org/abs/1510.01320) contains a lot of good examples.

This is the information about the "Snapshots" table:
<img src="images/snapshots.png">

Try out the following SQL query in the millimil query form, which downloads the redshift and snapshot numbers from the "Snapshots" table:

``` mysql
SELECT redshift, snapnum
FROM snapshots; 
```

The "where" clause is used in SQL to specify a condition. For example, to get all the information in the "Snapshots" table that corresponds to a snapshot number of 19:

``` mysql
SELECT *
FROM snapshots
WHERE snapnum = 19 ; 
```
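
To see the effect of the `WHERE` clause without connecting to the Virgo Database, the two queries above can be tried on a local toy table. The following is only a minimal sketch using Python's built-in `sqlite3` module: the table contents are invented for illustration and are not the millimil values, and `toy_db` and `toy_rows` are hypothetical names.

```python
import sqlite3

# Hypothetical stand-in for the "Snapshots" table. The values below are
# invented for illustration and are NOT the ones stored in millimil.
toy_rows = [(17, 7.5), (18, 6.0), (19, 5.0), (20, 4.0)]

toy_db = sqlite3.connect(":memory:")
toy_db.execute("CREATE TABLE snapshots (snapnum INTEGER, redshift REAL)")
toy_db.executemany("INSERT INTO snapshots VALUES (?, ?)", toy_rows)

# Without WHERE: the selected columns are returned for every row.
all_rows = toy_db.execute("SELECT redshift, snapnum FROM snapshots").fetchall()

# With WHERE: only the rows that satisfy the condition are returned.
one_row = toy_db.execute("SELECT * FROM snapshots WHERE snapnum = 19").fetchall()

print(len(all_rows), one_row)
toy_db.close()
```

The first query returns all four toy rows, while the second returns only the row whose `snapnum` equals 19.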

### Q.2 What is the snapshot number that corresponds to $z=0$?

### Q.3 Explore the table [MPAhalo](http://virgodb.dur.ac.uk:8080/Millennium/Help?page=databases/millimil/mpahalo) and write an SQL query to get the number of simulation particles in haloes and one measure of their mass at $z=0$.

The same queries can be run from [python](https://docs.python.org/3/tutorial/). There are several packages to do so, including one specifically created for the [Virgo Database](http://virgo.dur.ac.uk/data.php): eagleSqlTools.py. This package requires numpy and needs to be imported in your python program:

In [None]:
import numpy as np           # used below for np.min, np.max and np.savetxt
import eagleSqlTools as sql

Next, a connection needs to be defined, specifying a username ("xyz"), a password ("abc") and the URL of the [Database](http://virgodb.dur.ac.uk:8080/Millennium):

In [None]:
con = sql.connect("xyz", "abc", url="http://virgodb.dur.ac.uk:8080/Millennium")

The SQL query is defined as a string (extra spaces and line breaks inside it do not matter) and can be run as follows:

In [None]:
the_query = """SELECT * 
               FROM snapshots 
               WHERE snapnum = 19 ;"""

data = con.execute_query(the_query)
print(type(data))

To easily access the outcome from the query, a name can be given to each downloaded value:

In [None]:
the_query = """SELECT redshift as redshift, snapnum as snapnum
               FROM snapshots ;"""

data = con.execute_query(the_query)
redshift = data["redshift"]
snapnum  = data["snapnum"]

print(np.min(redshift), np.max(redshift))
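
Access by column name, as used above, can be tried offline on a small array. This is a minimal sketch, assuming the query result behaves like a NumPy structured array with one named field per selected column. The array `toy_data` and its values are invented for illustration.

```python
import numpy as np

# Toy structured array standing in for a query result (invented values).
toy_data = np.array([(5.0, 19), (4.0, 20), (3.0, 21)],
                    dtype=[("redshift", "f8"), ("snapnum", "i4")])

# The field names play the role of the column names given in the SELECT.
print(toy_data.dtype.names)

# Each field is extracted as an ordinary one-dimensional array.
redshift = toy_data["redshift"]
snapnum = toy_data["snapnum"]

# Boolean masks built from one column can be used to select from another.
mask = redshift < 4.5
print(snapnum[mask])

# Snapshot number of the row with the smallest redshift in the toy array.
print(snapnum[np.argmin(redshift)])
```

Because the columns are plain arrays, the usual NumPy tools (masks, `np.argmin`, `np.min`, `np.max`) apply to them directly.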

The information retrieved from the SQL query, 'data', can be stored into a file, 'z_snap.txt', using for example savetxt from numpy:

In [None]:
outfile = 'z_snap.txt'
np.savetxt(outfile,data)
print('Output file:', outfile)
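
`np.savetxt` also accepts the optional arguments `fmt` and `header`, which control how each column is written and what is placed on the first line of the file. The sketch below illustrates them on an invented toy array written to a temporary file. The names `toy_data` and `toy_file` and the chosen formats are assumptions made for the example, not part of the tutorial's workflow.

```python
import os
import tempfile

import numpy as np

# Toy structured array standing in for a query result (invented values).
toy_data = np.array([(5.0, 19), (4.0, 20)],
                    dtype=[("redshift", "f8"), ("snapnum", "i4")])

# Write to a temporary directory so that no existing file is overwritten.
toy_file = os.path.join(tempfile.mkdtemp(), "toy_z_snap.txt")

# One format per column; the header text is written on a line starting with "# ".
np.savetxt(toy_file, toy_data, fmt=["%.5f", "%d"], header="redshift snapnum")

# Read the file back as plain text to inspect its structure.
with open(toy_file) as f:
    lines = f.read().splitlines()
print(lines)
```

Comparing this file with the one produced by the default call shows how the two optional arguments change the layout.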


### Q.4 What is the structure of the output file? How is the redshift stored? What is in the header?

Read the redshift from the output file:

In [None]:
import csv
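
One possible way of reading a space-separated text file with the `csv` module is sketched below. It is not the only approach, and it assumes that the redshift is in the first column and that comment lines start with `#`. To be self-contained it first writes a small invented file, `toy_file`, rather than reading 'z_snap.txt'.

```python
import csv
import os
import tempfile

# Create a small toy file (invented values) so the example runs on its own.
toy_file = os.path.join(tempfile.mkdtemp(), "toy_z_snap.txt")
with open(toy_file, "w") as f:
    f.write("# redshift snapnum\n5.0 19\n4.0 20\n")

redshift_read = []
with open(toy_file, newline="") as f:
    reader = csv.reader(f, delimiter=" ")
    for row in reader:
        # Skip empty lines and comment/header lines.
        if not row or row[0].startswith("#"):
            continue
        # Assumption: the first column holds the redshift.
        redshift_read.append(float(row[0]))

print(redshift_read)
```

The `csv` reader returns every entry as a string, so each value has to be converted explicitly with `float`.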


### Q.5 Modify the code above to produce a file with the number of simulation particles in haloes and one measurement of their mass at $z=0$, from the table [MPAhalo](http://virgodb.dur.ac.uk:8080/Millennium/Help?page=databases/millimil/mpahalo).