<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Reading-ascii-with-Python-built-in-functions" data-toc-modified-id="Reading-ascii-with-Python-built-in-functions-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Reading ascii with <code>Python</code> built-in functions</a></span></li><li><span><a href="#Reading-ascii-with-numpy" data-toc-modified-id="Reading-ascii-with-numpy-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Reading ascii with <code>numpy</code></a></span><ul class="toc-item"><li><span><a href="#np.genfromtxt" data-toc-modified-id="np.genfromtxt-2.1"><span class="toc-item-num">2.1&nbsp;&nbsp;</span><code>np.genfromtxt</code></a></span></li></ul></li></ul></div>

**@juliaroquette: In this notebook I will show different ways to read and write files.**

First, let's define a working directory:

In [1]:
mydir='/Users/juliaroquette/Science/Python4DummiesIPAG/'


## Reading ascii with `Python` built-in functions

My first example table is an ASCII file called `example1.txt`.

Using only Python built-in functions, we can use `open` and `read`, for example:


In [23]:
testfile1='example1.txt'
file = open(mydir+testfile1, 'r')
t=file.read()
file.close()
print(t)

# Same file read line by line; the with block closes it automatically.
with open(mydir+testfile1,'r') as file:
    lines = file.readlines()
    print(lines)

2456784.471990   18.944695    0.027383 -0.242  3.74 1.27821 -0.55629 1937.508    2.679 0
2456784.476364   18.976622    0.025373 -0.132  4.03 1.26868 -0.52874 1938.049    2.581 0
2456784.480708   18.594748    0.018693 -0.154  4.82 1.25978 -0.50139 1938.174    2.718 0
2456784.484940   18.471004    0.021103 -0.349  4.70 1.25163 -0.47474 1938.104    2.676 0
2456784.489297   18.363495    0.016802 -0.233  5.69 1.24378 -0.44730 1938.672    2.880 0
2456784.493522   18.443390    0.018161 -0.243  4.81 1.23667 -0.42070 1938.370    2.888 0
2456784.497886   18.268200    0.017850 -0.364  5.99 1.22983 -0.39321 1938.807    3.025 0
2456784.502093   18.148296    0.015756 -0.347  5.34 1.22371 -0.36671 1938.601    3.116 0
2456784.506479   18.027699    0.014458 -0.375  6.26 1.21783 -0.33910 1938.912    3.364 0
2456784.510678   18.004862    0.012513 -0.267  6.01 1.21267 -0.31265 1938.879    3.339 0
2456784.523347 -999.000000 -999.000000 -0.102  3.59 1.19974 -0.23287 1938.786    2.639 1
2456784.527546 -999.0

In [157]:
print(len(t),type(t))

35432 <class 'str'>


This reads the whole content of the file as a single string, so the numbers still have to be converted by hand.
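As a minimal sketch of what "converting by hand" could look like, the block below splits each line on whitespace and casts the pieces to `float`. It writes the first five columns of two rows from the table above to a temporary file so that it runs on its own; `demo_path` and `rows` are names made up for this illustration.

```python
import os
import tempfile

# Two (shortened) rows copied from example1.txt, written to a temporary
# file so the sketch is self-contained. demo_path is a hypothetical name.
sample = (
    "2456784.471990   18.944695    0.027383 -0.242  3.74\n"
    "2456784.476364   18.976622    0.025373 -0.132  4.03\n"
)
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(demo_path, 'w') as f:
    f.write(sample)

rows = []
with open(demo_path, 'r') as f:
    for line in f:
        if line.strip():  # skip empty lines
            # split() with no argument splits on any run of whitespace
            rows.append([float(x) for x in line.split()])

print(len(rows), len(rows[0]), type(rows[0][0]))
```

This works, but every special case (comments, missing values, mixed types) has to be handled manually.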

## Reading ascii with `numpy`

But let's not spend more time on that, as there are more robust ways of reading the file with numpy and other third-party packages, for example `np.loadtxt` or `np.genfromtxt`. I prefer the second.

In [3]:
import numpy as np

### `np.genfromtxt`

In [7]:
t=np.genfromtxt(mydir+testfile1)

In [5]:
t.shape

(402, 10)

In [160]:
type(t)

numpy.ndarray

In [161]:
t=t.transpose() #transpose the matrix
t.shape

(10, 402)

In [162]:
print(t[1,0:20]) #print the first 20 elements of column 1

[  18.944695   18.976622   18.594748   18.471004   18.363495   18.44339
   18.2682     18.148296   18.027699   18.004862 -999.       -999.
 -999.       -999.       -999.       -999.       -999.       -999.
 -999.       -999.      ]
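If you only transpose in order to get at the columns, `np.genfromtxt` also accepts `unpack=True`, which returns the transposed array directly. Here is a small sketch on an in-memory sample (three rows, first three columns of the table above); the names `col0`, `col1`, `col2` are placeholders, not the real meaning of the columns.

```python
import io
import numpy as np

# In-memory stand-in for the file: three rows, three columns.
sample = io.StringIO(
    "2456784.471990   18.944695    0.027383\n"
    "2456784.476364   18.976622    0.025373\n"
    "2456784.480708   18.594748    0.018693\n"
)

# unpack=True transposes the result, so each column can go to its own name.
col0, col1, col2 = np.genfromtxt(sample, unpack=True)
print(col1)
```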


I can choose to read only a few columns, for example the first three. Also, if you took a look at the table, you might have seen that missing data appear as -999.000000. We can deal with both things at once:

In [6]:
t=np.genfromtxt(mydir+testfile1,usecols=(0,1,2),missing_values='-999.000000',usemask=True)

Using `usemask=True` means that you also get a boolean mask telling you where the missing data are. You can access this mask as:

In [164]:
t.mask

array([[False, False, False],
       [False, False, False],
       [False, False, False],
       ...,
       [False, False, False],
       [False, False, False],
       [False, False, False]])

In [165]:
t=t.transpose()
t.shape

(3, 402)

The effect of masking is something like:

In [166]:
print(t[1,0:20])

[18.944695 18.976622 18.594748 18.471004 18.363495 18.44339 18.2682
 18.148296 18.027699 18.004862 -- -- -- -- -- -- -- -- -- --]


In [167]:
t=t.filled(np.nan)
print(t[1,0:20])

[18.944695 18.976622 18.594748 18.471004 18.363495 18.44339  18.2682
 18.148296 18.027699 18.004862       nan       nan       nan       nan
       nan       nan       nan       nan       nan       nan]
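The practical difference between the two representations shows up when you compute statistics. The sketch below (an in-memory sample with one missing row, not the real file) suggests that a masked array skips the masked entries by itself, while for the NaN-filled version you would reach for the `np.nan*` functions.

```python
import io
import numpy as np

# In-memory stand-in: the last row has the -999.000000 missing-data flag.
sample = io.StringIO(
    "2456784.471990   18.944695    0.027383\n"
    "2456784.476364   18.976622    0.025373\n"
    "2456784.523347 -999.000000 -999.000000\n"
)
demo = np.genfromtxt(sample, usecols=(0, 1, 2),
                     missing_values='-999.000000', usemask=True)

col1 = demo[:, 1]                  # masked array for the second column
n_missing = int(col1.mask.sum())   # how many entries are flagged

mean_masked = col1.mean()          # masked entries are ignored
filled = col1.filled(np.nan)       # plain ndarray with NaN in the gaps
mean_nan = np.nanmean(filled)      # NaN-aware mean gives the same answer

print(n_missing, mean_masked, mean_nan)
```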


The second example uses FITS (`.fits`) files. To read them you will need the `fits` module from `astropy`.

In [10]:
testfile='example3.fits' #name the file to be read

In [11]:
import astropy.io.fits as pf #load module


To open the FITS file itself:

In [12]:
hdu=pf.open(mydir+testfile)

In [13]:
print(hdu)


[<astropy.io.fits.hdu.image.PrimaryHDU object at 0x1512afae10>, <astropy.io.fits.hdu.table.BinTableHDU object at 0x1512b4fa20>]


So, this file has two header data units (HDUs): a primary one and a secondary one. You can get more info using:

In [14]:
hdu.info()


Filename: /Users/juliaroquette/Science/Python4DummiesIPAG/example3.fits
No.    Name      Ver    Type      Cards   Dimensions   Format
  0  PRIMARY       1 PrimaryHDU      16   (997345,)   uint8   
  1  /Users/bouvijer/Documents/Reunions_diverses/Projects/Monitor/NGC3...    1 BinTableHDU     33   211R x 7C   [E, E, 402D, 402E, 402E, D, D]   


To see the header of the primary unit:

In [15]:
hdu[0].header


SIMPLE  =                    T / Standard FITS format                           
BITPIX  =                    8 / Character data                                 
NAXIS   =                    1 / Text string                                    
NAXIS1  =               997345 / Number of characters                           
VOTMETA =                    T / Table metadata in VOTable format               
EXTEND  =                    T / There are standard extensions                  
COMMENT                                                                         
COMMENT The data in this primary HDU consists of bytes which                    
COMMENT comprise a VOTABLE document.                                            
COMMENT The VOTable describes the metadata of the table contained               
COMMENT in the following BINTABLE extension.                                    
COMMENT Such a BINTABLE extension can be used on its own as a perfectly         
COMMENT good table, but the 

To see the header of the secondary unit:

In [16]:
hdu[1].header


XTENSION= 'BINTABLE'           / binary table extension                         
BITPIX  =                    8 / 8-bit bytes                                    
NAXIS   =                    2 / 2-dimensional table                            
NAXIS1  =                 6456 / width of table in bytes                        
NAXIS2  =                  211 / number of rows in table                        
PCOUNT  =                    0 / size of special data area                      
GCOUNT  =                    1 / one data group                                 
TFIELDS =                    7 / number of columns                              
EXTNAME = '/Users/bouvijer/Documents/Reunions_diverses/Projects/Monitor/NGC3...'
TTYPE1  = 'medflux '           / label for column 1                             
TFORM1  = 'E       '           / format for column 1                            
TUNIT1  = 'mag     '           / units for column 1                             
TTYPE2  = 'rms     '        

To see the content of the columns:

In [17]:
hdu[1].columns



ColDefs(
    name = 'medflux'; format = 'E'; unit = 'mag'
    name = 'rms'; format = 'E'; unit = 'mag'
    name = 'hjd'; format = '402D'; unit = 'days'
    name = 'flux'; format = '402E'; unit = 'mag'
    name = 'fluxerr'; format = '402E'; unit = 'mag'
    name = 'ra'; format = 'D'; unit = 'radians'
    name = 'dec'; format = 'D'; unit = 'radians'
)

To get the data in one of the columns:

In [18]:
medflux=hdu[1].data["medflux"]

In [19]:
type(medflux)

numpy.ndarray

In [20]:
medflux.shape

(211,)

In [21]:
print(medflux.max(),medflux.min())

22.6865 12.549112


You can save the content of a header using built-in Python functions:

In [22]:
f=open(mydir+'saveheader.txt','w')
f.write(repr(hdu[1].header))
f.close()

Of course, you should close your FITS file at the end:

In [213]:
hdu.close()

The example with the header above shows how to write files using the built-in functions. As in the case of reading tables with such functions, there are more robust options available. One of these is numpy's `np.savetxt`. For an example, let's create some random data:

In [25]:
ID=np.arange(1,201,dtype=int)
d1=np.random.uniform(15.,25.,200)
d2=np.random.uniform(0.01,0.15,200)

Take a look at the `numpy.savetxt` documentation at https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.savetxt.html, but you can save the data as ASCII simply with:

In [216]:
np.savetxt(mydir+'mytxt.txt', np.vstack((ID,d1,d2)).T,delimiter="\t",fmt=['%4i','%10.5f','%10.5f'],header='ID data1 data2')
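Two details of `np.savetxt` are easy to miss: when `fmt` is a single multi-format string the `delimiter` argument is ignored, and the `header` text is automatically prefixed with `'# '` (the default `comments` value). A quick round trip is one way to check what was actually written. The sketch below uses a small seeded sample and a temporary file; `demo_path`, `first_line` and `back` are names invented for the illustration.

```python
import os
import tempfile
import numpy as np

ID = np.arange(1, 6, dtype=int)
rng = np.random.default_rng(0)   # seeded so reruns give the same numbers
d1 = rng.uniform(15., 25., 5)
d2 = rng.uniform(0.01, 0.15, 5)

demo_path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

# One format per column as a list, so that delimiter is actually used.
np.savetxt(demo_path, np.column_stack((ID, d1, d2)),
           fmt=['%4i', '%10.5f', '%10.5f'], delimiter='\t',
           header='ID data1 data2')

# Inspect the header line exactly as it was written.
with open(demo_path) as f:
    first_line = f.readline().rstrip('\n')

# Read the table back; lines starting with '#' are skipped as comments.
back = np.genfromtxt(demo_path)

print(first_line)
print(back.shape)
```

The values read back agree with the originals only to the five decimals kept by `%10.5f`.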


Numpy also has a simple file format for saving its arrays, the `.npy` format. This option is very useful when you want to quickly restore arrays in another code. You can simply use:

In [26]:
np.save(mydir+'mysave',[ID,d1,d2])

To open it back, use:

In [27]:
t=np.load(mydir+'mysave.npy')


In [28]:
t.shape


(3, 200)

or

In [29]:
ID,d1,d2=np.load(mydir+'mysave.npy')
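Note that saving a list of arrays in one `.npy` file stacks them into a single array with one common dtype, so an integer column comes back as float. If that matters, `np.savez` is one option: it stores each array under its own name in a `.npz` archive. The sketch below uses small made-up arrays and a temporary directory (`demo_dir` is a hypothetical name).

```python
import os
import tempfile
import numpy as np

ID = np.arange(1, 6, dtype=int)
d1 = np.linspace(15., 25., 5)

demo_dir = tempfile.mkdtemp()

# A list of arrays is converted to one 2D array with a common dtype.
np.save(os.path.join(demo_dir, 'stacked'), [ID, d1])
stacked = np.load(os.path.join(demo_dir, 'stacked.npy'))

# np.savez keeps each array separate, with its own name and dtype.
np.savez(os.path.join(demo_dir, 'named'), ID=ID, d1=d1)
with np.load(os.path.join(demo_dir, 'named.npz')) as archive:
    ID_back = archive['ID']
    d1_back = archive['d1']

print(stacked.dtype, ID_back.dtype)
```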


Finally, here is how you can create a fits table with your data:

In [222]:
from astropy.table import Table
from astropy.io import fits

First, create the table itself:

In [223]:
table=Table()
table["ID"]=ID
table["data1"]=d1
table["data2"]=d2

Next, fill in the information on the columns:

In [224]:
col=[]
for j in range(3):
    col.append(fits.Column(name=table.colnames[j], format='E', array=table[table.colnames[j]]))


Next, create one or more data units:

In [225]:
hdu=[]
hdu.append(fits.BinTableHDU.from_columns(fits.ColDefs(col)))


A header:

In [226]:
hdr=fits.Header()
hdr['Author']='Julia Roquette'
hdr['dataset']='some information'
hdr["Comment"]="some other information"

Create the primary unit:

In [228]:
phdu=fits.PrimaryHDU(header=hdr)


Put all of this together:

In [229]:
hdul=fits.HDUList([phdu,hdu[0]])

and save it :)

In [230]:
hdul.writeto(mydir+'myfits.fits')