# Adding columns

In the preceding examples, we applied the built-in division operator to two numeric arguments of type `float`. The result of each operation was, appropriately, also a `float`.

We can also apply this operator to an entire `Series` of `float` values, dividing each value by a single `float` and producing a new `Series`.
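As a minimal sketch of this element-wise behavior, the example below divides a small `Series` by a single `float`. The name `distances` and its three sample values are chosen here for illustration only.

```python
import pandas as pd

# Hypothetical sample values; the name `distances` is ours, for illustration.
distances = pd.Series([57.9, 108.2, 149.6])

# Dividing a Series by one float divides every element by that float
# and returns a new Series; `distances` itself is left unchanged.
halved = distances / 2.0

print(halved)
```

The same pattern holds for the other arithmetic operators, such as `*`, `+`, and `-`.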

Many built-in operators are supported. For clarity, let's map our percent-change values to conventional percentages, using the operator for multiplication.

The `pct_change` method returns these values as ratios, comparing the distance of each planet from the sun, $distance_1$, to the distance of the planet preceding it, $distance_0$, according to:

$$
\frac{distance_1 - distance_0}{distance_0}
$$
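As a check on the formula above, we can apply it by hand using plain Python lists. This is only a sketch of the arithmetic, not of how `pct_change` is implemented; the distances are copied from the `solar_distance_km_6` column of `planets`, and the name `rel_change` is ours.

```python
# Solar distances copied from the solar_distance_km_6 column of `planets`.
distances = [57.9, 108.2, 149.6, 227.9, 778.6, 1433.5, 2872.5, 4495.1]

# The first planet has no predecessor, so its ratio is undefined (NaN).
rel_change = [float('nan')]

# Pair each distance with the one before it and apply the formula.
for previous, current in zip(distances, distances[1:]):
    rel_change.append((current - previous) / previous)

# Venus relative to Mercury: (108.2 - 57.9) / 57.9
print(round(rel_change[1], 6))  # 0.868739
```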

We can represent these as conventional percentages by simply multiplying them by `100`.

In [24]:
distance_pct_change = distance_rel_change * 100

distance_pct_change

0           NaN
1     86.873921
2     38.262477
3     52.339572
4    241.641071
5     84.112510
6    100.383676
7     56.487380
Name: solar_distance_km_6, dtype: float64

Although we may all have the order of the planets in mind, these values are more meaningful in context, alongside our other planetary data.

We can use the `assign` method to construct a new `DataFrame`, one which begins with the data from `planets` and adds our new column to it. The `planets` `DataFrame` itself is not modified.

In [25]:
planets.assign(distance_pct_change=distance_pct_change)

| | name | solar_distance_km_6 | mass_kg_24 | density_kg_m3 | gravity_m_s2 | distance_pct_change |
|---|---|---|---|---|---|---|
| 0 | Mercury | 57.9 | 0.33 | 5427.0 | 3.7 | NaN |
| 1 | Venus | 108.2 | 4.87 | 5243.0 | 8.9 | 86.873921 |
| 2 | Earth | 149.6 | 5.97 | 5514.0 | 9.8 | 38.262477 |
| 3 | Mars | 227.9 | 0.642 | 3933.0 | 3.7 | 52.339572 |
| 4 | Jupiter | 778.6 | 1898.0 | 1326.0 | 23.1 | 241.641071 |
| 5 | Saturn | 1433.5 | 568.0 | 687.0 | 9.0 | 84.11251 |
| 6 | Uranus | 2872.5 | 86.8 | 1271.0 | 8.7 | 100.383676 |
| 7 | Neptune | 4495.1 | 102.0 | 1638.0 | 11.0 | 56.48738 |


As you can see, the `assign` method accepts *only* keyword arguments. The keyword can be *any* syntactically valid Python name: `pandas` adds the sequence of values you supply to the existing data as a new column, and *assigns* that name to the new column.
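To illustrate, here is a small sketch that adds two columns in a single call. The frame `planets_demo` is a three-row stand-in for `planets`, and the column names `distance_rank` and `solar_distance_au` are invented for this example.

```python
import pandas as pd

# A small stand-in for `planets`; values copied from the table above.
planets_demo = pd.DataFrame({
    'name': ['Mercury', 'Venus', 'Earth'],
    'solar_distance_km_6': [57.9, 108.2, 149.6],
})

# Each keyword becomes a column name; each value becomes that column's data.
with_extras = planets_demo.assign(
    distance_rank=[1, 2, 3],
    # Distance relative to Earth's, i.e. roughly in astronomical units.
    solar_distance_au=planets_demo.solar_distance_km_6 / 149.6,
)

print(list(with_extras.columns))
print(list(planets_demo.columns))  # the original frame is unchanged
```

Because `assign` returns a new `DataFrame`, the result must be bound to a name (or used directly) if we want to keep it.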

Arguably, of course, our multiplication by 100 was only an aesthetic change. We might instead preserve the output of our `pct_change` calculation, and merely adjust the presentation of our `DataFrame`.

`pandas` offers the `DataFrame` property `style`, whose `format` method accepts either functions or strings, with which it determines how to present its data. String arguments to `format` follow Python's standard format-string syntax, which describes how a value should be presented as text.

Python's strings offer their own `format` method. For example, we can present the `float` value `0.868739` as a conventional percentage as follows.

In [26]:
'{:.2%}'.format(0.868739)

'86.87%'
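The format specification `.2%` is not tied to `str.format`. As a brief sketch, the same specification can be used with the built-in `format` function and in an f-string; the variable names below are ours.

```python
ratio = 0.868739

# '.2%' multiplies by 100, keeps two decimal places, and appends '%'.
as_percent = '{:.2%}'.format(ratio)

# The same specification works with the built-in `format` function
# and inside an f-string.
as_percent_builtin = format(ratio, '.2%')
as_percent_fstring = f'{ratio:.2%}'

# Changing the precision changes only the text, never `ratio` itself.
as_whole_percent = '{:.0%}'.format(ratio)

print(as_percent, as_percent_builtin, as_percent_fstring, as_whole_percent)
# 86.87% 86.87% 86.87% 87%
```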

In a similar manner we can apply this formatting to our column, without altering the underlying values.

In [27]:
planets_rel_change = planets.assign(distance_rel_change=distance_rel_change)

planets_rel_change

| | name | solar_distance_km_6 | mass_kg_24 | density_kg_m3 | gravity_m_s2 | distance_rel_change |
|---|---|---|---|---|---|---|
| 0 | Mercury | 57.9 | 0.33 | 5427.0 | 3.7 | NaN |
| 1 | Venus | 108.2 | 4.87 | 5243.0 | 8.9 | 0.868739 |
| 2 | Earth | 149.6 | 5.97 | 5514.0 | 9.8 | 0.382625 |
| 3 | Mars | 227.9 | 0.642 | 3933.0 | 3.7 | 0.523396 |
| 4 | Jupiter | 778.6 | 1898.0 | 1326.0 | 23.1 | 2.416411 |
| 5 | Saturn | 1433.5 | 568.0 | 687.0 | 9.0 | 0.841125 |
| 6 | Uranus | 2872.5 | 86.8 | 1271.0 | 8.7 | 1.003837 |
| 7 | Neptune | 4495.1 | 102.0 | 1638.0 | 11.0 | 0.564874 |


In [28]:
planets_rel_change.style.format({
    'distance_rel_change': '{:.2%}',
})

| | name | solar_distance_km_6 | mass_kg_24 | density_kg_m3 | gravity_m_s2 | distance_rel_change |
|---|---|---|---|---|---|---|
| 0 | Mercury | 57.9 | 0.33 | 5427 | 3.7 | nan% |
| 1 | Venus | 108.2 | 4.87 | 5243 | 8.9 | 86.87% |
| 2 | Earth | 149.6 | 5.97 | 5514 | 9.8 | 38.26% |
| 3 | Mars | 227.9 | 0.642 | 3933 | 3.7 | 52.34% |
| 4 | Jupiter | 778.6 | 1898.0 | 1326 | 23.1 | 241.64% |
| 5 | Saturn | 1433.5 | 568.0 | 687 | 9.0 | 84.11% |
| 6 | Uranus | 2872.5 | 86.8 | 1271 | 8.7 | 100.38% |
| 7 | Neptune | 4495.1 | 102.0 | 1638 | 11.0 | 56.49% |
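Mercury's entry is rendered as `nan%`, because the string format is applied to the missing value as well. Since `format` also accepts functions, one possible remedy is to pass a function that treats missing values separately. The helper `format_ratio` below is a hypothetical name introduced for this sketch, and showing an empty string for missing values is our choice.

```python
import math

def format_ratio(value):
    """Render a ratio as a percentage, or an empty string when it is missing."""
    if math.isnan(value):
        return ''
    return '{:.2%}'.format(value)

print(format_ratio(0.868739))             # 86.87%
print(repr(format_ratio(float('nan'))))   # ''
```

A function like this could be supplied in place of the format string, as in `planets_rel_change.style.format({'distance_rel_change': format_ratio})`. The `format` method of `style` also has an `na_rep` parameter, which sets the text shown for missing values while keeping the string format for the rest.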


Note that in the call to `format` above, we used another collection built into Python: the `dict`, or "dictionary." The syntax for constructing a `dict` is simply:

    {KEY0: VALUE0, …}

Dictionaries are widely useful, but here you'll see them only as another way of associating names with values, much as keyword arguments do in a call to `assign`.

We can also rename columns for clarity, again using a `dict`, which maps each old name to its new name.
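As a short sketch of this pairing of names with values, the example below builds a `dict`, looks up one of its values, and compares it with a set of keyword arguments. The function `show` and the `'{:.1f}'` format for `mass_kg_24` are hypothetical, introduced only for illustration.

```python
# A dict literal pairs each key with a value.
formats = {'distance_rel_change': '{:.2%}', 'mass_kg_24': '{:.1f}'}

# Look up a value by its key.
print(formats['distance_rel_change'])       # {:.2%}

# Use a looked-up format string on a value.
print(formats['mass_kg_24'].format(5.97))   # 6.0

def show(**columns):
    """Hypothetical helper: collects its keyword arguments into a dict."""
    return columns

# Keyword arguments and a dict carry the same name-to-value pairing.
print(show(distance_rel_change='{:.2%}') == {'distance_rel_change': '{:.2%}'})  # True
```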

In [29]:
planets_rel_change = planets_rel_change.rename(
    columns={
        'distance_rel_change': 'distance relative change',
    }
)

planets_rel_change.style.format({
    'distance relative change': '{:.2%}',
})

| | name | solar_distance_km_6 | mass_kg_24 | density_kg_m3 | gravity_m_s2 | distance relative change |
|---|---|---|---|---|---|---|
| 0 | Mercury | 57.9 | 0.33 | 5427 | 3.7 | nan% |
| 1 | Venus | 108.2 | 4.87 | 5243 | 8.9 | 86.87% |
| 2 | Earth | 149.6 | 5.97 | 5514 | 9.8 | 38.26% |
| 3 | Mars | 227.9 | 0.642 | 3933 | 3.7 | 52.34% |
| 4 | Jupiter | 778.6 | 1898.0 | 1326 | 23.1 | 241.64% |
| 5 | Saturn | 1433.5 | 568.0 | 687 | 9.0 | 84.11% |
| 6 | Uranus | 2872.5 | 86.8 | 1271 | 8.7 | 100.38% |
| 7 | Neptune | 4495.1 | 102.0 | 1638 | 11.0 | 56.49% |


Because the new column name contains spaces, it is no longer a valid Python identifier, so we can't refer to it with the attribute syntax we've used for the other columns, such as `solar_distance_km_6`:

In [30]:
planets_rel_change.solar_distance_km_6

0      57.9
1     108.2
2     149.6
3     227.9
4     778.6
5    1433.5
6    2872.5
7    4495.1
Name: solar_distance_km_6, dtype: float64

We can still refer to it, however, using the alternate, square-bracket syntax mentioned above:

In [31]:
planets_rel_change['distance relative change']

0         NaN
1    0.868739
2    0.382625
3    0.523396
4    2.416411
5    0.841125
6    1.003837
7    0.564874
Name: distance relative change, dtype: float64
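The difference between the two syntaxes can be sketched with a small stand-in frame. The name `demo` and its two rows are ours; `str.isidentifier` is a standard string method that reports whether a name is a valid Python identifier.

```python
import pandas as pd

# A two-row stand-in for `planets_rel_change`, renamed as above.
demo = pd.DataFrame({
    'solar_distance_km_6': [57.9, 108.2],
    'distance_rel_change': [float('nan'), 0.868739],
}).rename(columns={'distance_rel_change': 'distance relative change'})

# Only names that are valid identifiers can be written after a dot.
print('solar_distance_km_6'.isidentifier())       # True
print('distance relative change'.isidentifier())  # False

# Square brackets accept any column name, with or without spaces.
by_attribute = demo.solar_distance_km_6
by_brackets = demo['solar_distance_km_6']
spaced = demo['distance relative change']

print(by_attribute.equals(by_brackets))  # True
```

Since the square-bracket form works for every column name, it is the safer choice whenever a name might not be a valid identifier.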