
Table item access definition

taldcroft edited this page Dec 5, 2011 · 3 revisions

Item access definition by example

Examples of item access using astropy.table are shown below. The output of each example is shown in detail further below, along with commentary about the returned object. Please feel free to edit this page directly to comment:

import numpy as np
from astropy.table import Column, TableColumns, Table
tc = TableColumns(cols=[Column('a'), Column('b'), Column('c')])
tc['a'] 
tc[1] 
tc['a', 'b']
tc[1:3] 

t = Table(np.arange(30).reshape(10,3), names=('a','b','c'))
t.columns 
t['a'] 
t[1]   
t[2:5] 
t[np.array([2,5,7])] 
t['a', 'c']

Define Table Columns :

>>> tc = TableColumns(cols=[Column('a'), Column('b'), Column('c')])

The current implementation does not have a tight coupling between TableColumns and a parent table. There is a table attribute but it isn't actually used for anything (and will be removed unless we identify a use). The original suggestion of having slicing or multiple column access return the corresponding Table seems inconsistent. It feels like slicing a TableColumns object should return another TableColumns object. One can always do Table(tc[2:5]) to make a new table with columns 2:5, and the Table object now supports selecting multiple columns to create a new table.

If there is no real table coupling, then calling the class ColumnList makes more sense, except that it is implemented as an OrderedDict, so implying it behaves like a list would be confusing. I'm not tied to using OrderedDict; this was inherited from ATPy. In my head the columns in a table are really a list-like entity whose items you should also be able to access by matching the column name (or names, in the case of VO). I always want to do "for col in table.columns" and have col be a Column, not the name of a Column. So maybe move to a plain list as the basis for ColumnList?
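To make the list-based idea concrete, here is a minimal sketch of what such a ColumnList might look like. It is not the astropy implementation: the Column class below is a bare stand-in and the ColumnList name and behavior are assumptions made for illustration only.

```python
class Column:
    """Stand-in for astropy's Column; only the name matters here (hypothetical)."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "<Column name='{}'>".format(self.name)


class ColumnList(list):
    """Hypothetical list-based container that also supports access by name."""

    def __getitem__(self, item):
        if isinstance(item, str):
            # Access by column name: return the single matching Column.
            for col in self:
                if col.name == item:
                    return col
            raise KeyError(item)
        if isinstance(item, tuple):
            # Several names: return a new ColumnList with those columns.
            return ColumnList(self[name] for name in item)
        if isinstance(item, slice):
            # Slice: return a new ColumnList, not a plain list.
            return ColumnList(super().__getitem__(item))
        # Integer position: ordinary list indexing.
        return super().__getitem__(item)


cols = ColumnList([Column('a'), Column('b'), Column('c')])
names_in_order = [col.name for col in cols]  # iteration yields Column objects
```

With this sketch, cols['a'], cols[1], cols['a', 'b'] and cols[1:3] mirror the four TableColumns access forms listed at the top of the page, while iteration gives Column objects rather than names.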

Access table column by name : returns one Column :

>>> tc['a'] 
<Column name='a' units='None' format='None' description='None'>
array([], dtype=float64)

Access table column by position : returns one Column :

>>> tc[1] 
<Column name='b' units='None' format='None' description='None'>
array([], dtype=float64)

Multiple columns : Returns new TableColumns :

>>> tc['a', 'b']
<TableColumns names=('a','b')>

Columns slice : Returns new TableColumns :

>>> tc[1:3] 
<TableColumns names=('b','c')>

Define a Table : 10 rows and 3 columns :

>>> t = Table(np.arange(30).reshape(10,3), names=('a','b','c'))

Get the underlying TableColumns object :

>>> t.columns
<TableColumns names=('a','b','c')>

The main unique functionality brought by this object is selecting columns by numerical index instead of name. Personally I never find a need to do this, but if others find this useful then it is now there.

Get a Column : returns REF to Table.columns['a'] :

>>> t['a']
<Column name='a' units='None' format='None' description='None'>
array([ 0,  3,  6,  9, 12, 15, 18, 21, 24, 27])

Get a Row : returns Row object with REF to table data :

>>> t[1]
(3, 4, 5)

In the current code this just returns self._data[1], i.e. np.void REF to the row values.
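The reference semantics of an np.void row can be checked with plain NumPy. The sketch below assumes the table data is held in a structured array like the one shown in the outputs on this page; the variable names are illustrative and not part of astropy.

```python
import numpy as np

# Build a structured array equivalent to the 10x3 example table.
raw = np.arange(30).reshape(10, 3)
data = np.array([tuple(r) for r in raw],
                dtype=[('a', 'i8'), ('b', 'i8'), ('c', 'i8')])

row = data[1]          # integer index on a structured array gives np.void
row_type = type(row)

# The structured scalar acts as a view: writing to it changes the array.
row['a'] = -1
row_after_write = data[1].item()
```

So a Row object that simply wraps self._data[1] inherits this write-through behavior from NumPy.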

Get a Table slice : returns new Table object with REF view of rows 2:5 :

>>> t[2:5]  
<Table rows=3 names=('a','b','c')>
array([(6, 7, 8), (9, 10, 11), (12, 13, 14)], 
      dtype=[('a', '<i8'), ('b', '<i8'), ('c', '<i8')])

Currently the code returns a COPY of the rows rather than the REF view described above. I think this should be changed to be a view.

Get a fancy index slice : returns COPY of the rows :

>>> t[np.array([2,5,7])]  # Table obj with rows 2, 5, 7 (COPY)
<Table rows=3 names=('a','b','c')>
array([(6, 7, 8), (15, 16, 17), (21, 22, 23)], 
      dtype=[('a', '<i8'), ('b', '<i8'), ('c', '<i8')])

This behavior is consistent with NumPy.
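For comparison, the NumPy behavior being referred to can be demonstrated on a plain structured array. This is only a sketch of NumPy's own slicing and fancy-indexing semantics, not of the Table code; the variable names are illustrative.

```python
import numpy as np

raw = np.arange(30).reshape(10, 3)
data = np.array([tuple(r) for r in raw],
                dtype=[('a', 'i8'), ('b', 'i8'), ('c', 'i8')])

sliced = data[2:5]                   # basic slice: a view on the same memory
fancy = data[np.array([2, 5, 7])]    # fancy index: a new, independent array

slice_shares = np.shares_memory(sliced, data)
fancy_shares = np.shares_memory(fancy, data)

sliced['a'][0] = -1   # writes through to data['a'][2]
fancy['a'][0] = -2    # leaves data untouched
```

Here slice_shares is True and fancy_shares is False, which is the REF versus COPY distinction used throughout this page.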

Select columns from Table : returns new Table with COPY of selected column data :

>>> t['a', 'c']  # Table with cols 'a', 'c' (COPY)
<Table rows=10 names=('a','c')>
array([(0, 2), (3, 5), (6, 8), (9, 11), (12, 14), (15, 17), (18, 20),
       (21, 23), (24, 26), (27, 29)], 
      dtype=[('a', '<i8'), ('c', '<i8')])

Is COPY or REF better here? Probably most users would imagine they are getting a copy when they do this operation, and in some sense it is closer to fancy indexing than slicing.
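To illustrate what the COPY option means in practice, here is one possible way to copy selected fields out of a structured array. The helper select_columns_copy is hypothetical and is not how Table implements column selection; it only shows the resulting semantics.

```python
import numpy as np


def select_columns_copy(data, names):
    """Return a new structured array holding copies of the named fields.

    Hypothetical helper for illustration only.
    """
    new_dtype = np.dtype([(name, data.dtype[name]) for name in names])
    out = np.empty(data.shape, dtype=new_dtype)
    for name in names:
        out[name] = data[name]   # copies the values field by field
    return out


raw = np.arange(30).reshape(10, 3)
data = np.array([tuple(r) for r in raw],
                dtype=[('a', 'i8'), ('b', 'i8'), ('c', 'i8')])

sub = select_columns_copy(data, ('a', 'c'))
sub['a'][0] = -1   # modifies only the copy; data['a'][0] is unchanged
```

With COPY semantics the user can modify the selected columns freely without side effects on the parent table, at the cost of duplicating the column data.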
