Issue data-8#163: Documentation added for method stack in transformation
SambhaviPD committed Oct 7, 2020
1 parent e9fdc81 commit 7764e40
Showing 1 changed file with 122 additions and 108 deletions.
230 changes: 122 additions & 108 deletions datascience/tables.py
@@ -1810,133 +1810,147 @@ def pivot_bin(self, pivot_columns, value_column, bins=None, **vargs) :
        return binned

    def stack(self, key, labels=None):
        """Stacks rows from a table based on key and labels.

        Args:
            ``key``: Label of the column used as the key for each row;
                the other 2 columns are populated against it.

        Kwargs:
            ``labels``: Defaults to None. If a value is passed, the
                returned table has values for the passed label only,
                instead of all columns.

        Returns:
            A table with 3 columns where the first column is the passed
            argument ``key`` and the other two columns hold the remaining
            column names of the table and their associated values, with
            each column-value combination as a row against that key.
            If ``labels`` is also passed, the last 2 columns hold only
            that label's column and its value.

        A few examples:

        Example 1:

        >>> players = Table().with_columns('player_id', \
            make_array(110234, 110235), 'wOBA', make_array(.354, .236))
        >>> players.stack(key='player_id')

        gives the following output:

        player_id | column | value
        110234 | wOBA | 0.354
        110235 | wOBA | 0.236

        whereas if we pass a different key for the same table,
        we get a different combination.

        >>> players = Table().with_columns('player_id', \
            make_array(110234, 110235), 'wOBA', make_array(.354, .236))
        >>> players.stack(key='wOBA')

        gives the following output:

        wOBA | column | value
        0.354 | player_id | 110234
        0.236 | player_id | 110235

        Example 2:

        Let's take another table with a different data set.

        >>> jobs = Table().with_columns( \
            'job', make_array('a', 'b', 'c', 'd'), \
            'wage', make_array(10, 20, 15, 8))
        >>> jobs.stack(key='wage')

        gives the following output:

        wage | column | value
        10 | job | a
        20 | job | b
        15 | job | c
        8 | job | d

        As in the previous example, let's change the key.

        >>> jobs = Table().with_columns( \
            'job', make_array('a', 'b', 'c', 'd'), \
            'wage', make_array(10, 20, 15, 8))
        >>> jobs.stack(key='job')

        gives the following output:

        job | column | value
        a | wage | 10
        b | wage | 20
        c | wage | 15
        d | wage | 8

        Example 3:

        Let's take one more table, but with 3 columns rather than 2 as
        in the previous examples.

        >>> table = Table().with_columns( \
            'days', make_array(0, 1, 2, 3, 4, 5), \
            'price', make_array(90.5, 90.00, 83.00, 95.50, 82.00, 82.00), \
            'projection', make_array(90.75, 82.00, 82.50, 82.50, 83.00, 82.50))
        >>> table.stack(key='price')

        gives the following output:

        price | column | value
        90.5 | days | 0
        90.5 | projection | 90.75
        90 | days | 1
        90 | projection | 82
        83 | days | 2
        83 | projection | 82.5
        95.5 | days | 3
        95.5 | projection | 82.5
        82 | days | 4
        82 | projection | 83
        (2 rows omitted)

        Example 4:

        If we specify a particular label, we get only the values related
        to that label.

        >>> table.stack(key='price', labels="days")
        price | column | value
        90.5 | days | 0
        90 | days | 1
        83 | days | 2
        95.5 | days | 3
        82 | days | 4
        82 | days | 5

        Example 5:

        If we give a non-existent key, we get an AttributeError.

        >>> players = Table().with_columns('player_id', \
            make_array(110234, 110235), 'wOBA', make_array(.354, .236))
        >>> players.stack(key='abc')

        gives the following output:

        AttributeError: Attribute (abc) not found in row.

        Example 6:

        If we give a non-existent label, we get an empty table without
        any errors.

        >>> players = Table().with_columns('player_id', \
            make_array(110234, 110235), 'wOBA', make_array(.354, .236))
        >>> players.stack(key="wOBA", labels="abc")

        gives the following output:

        wOBA | column | value
        """
        rows, labels = [], labels or self.labels
        for row in self.rows:
            [rows.append((getattr(row, key), k, v)) for k, v in row.asdict().items()
@@ -5654,4 +5668,4 @@ def __getitem__(self, row_indices_or_slice):

# For Sphinx: grab the docstrings from `Taker.__getitem__` and `Withouter.__getitem__`
Table.take.__doc__ = _RowTaker.__getitem__.__doc__
Table.exclude.__doc__ = _RowExcluder.__getitem__.__doc__
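
The stacking behaviour documented for `stack` can be illustrated with a minimal sketch in plain Python. This is not the library's implementation: `stack_rows` is a hypothetical helper that works on a list of dicts rather than a `Table`. The filter condition of the list comprehension is cut off in the diff, so the sketch assumes, based on the docstring examples, that the key column is skipped and only columns named in `labels` are kept. It also assumes a bare string should be treated as a single label, and it raises `KeyError` (not the `AttributeError` shown in Example 5) for a missing key.

```python
def stack_rows(rows, key, labels=None):
    """Turn each row dict into (key value, column name, value) triples.

    Hypothetical helper for illustration only; not part of datascience.
    """
    if isinstance(labels, str):
        labels = [labels]  # assumption: a bare string means one label
    triples = []
    for row in rows:
        key_value = row[key]  # raises KeyError if the key column is missing
        keep = labels or list(row)  # no labels given: consider every column
        for name, value in row.items():
            # Assumed filter: skip the key column, keep only requested labels.
            if name != key and name in keep:
                triples.append((key_value, name, value))
    return triples


table = [
    {"days": 0, "price": 90.5, "projection": 90.75},
    {"days": 1, "price": 90.0, "projection": 82.0},
]

print(stack_rows(table, key="price"))
# [(90.5, 'days', 0), (90.5, 'projection', 90.75), (90.0, 'days', 1), (90.0, 'projection', 82.0)]
print(stack_rows(table, key="price", labels="days"))
# [(90.5, 'days', 0), (90.0, 'days', 1)]
print(stack_rows(table, key="price", labels="abc"))
# []
```

Each input row contributes one output triple per non-key column, which is why the two-row table above yields four triples without `labels` and two with `labels="days"`.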
