## Data Wrangling with Pandas

In [None]:
import pandas as pd

In [None]:
df = pd.read_csv('data/media.csv')
df

### Dropping elements in a DataFrame

To remove specific rows or columns from a DataFrame, we can use the drop method:

In [None]:
df.drop([0, 1])

As you can see, we passed a list of the elements we want to drop, in this case the integer index labels 0 and 1. With the default index, this removes the first two rows of the DataFrame. However, if we print the content of the DataFrame again, the following happens:

In [None]:
df

The rows are still in place, because drop returns a new DataFrame and leaves the original untouched. We have two options here. One is to assign the result of the drop operation to a new variable:

In [None]:
df_new = df.drop([0, 1])
df_new
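
Note that drop looks up index labels, not row positions. The following is a minimal sketch on a hypothetical toy DataFrame (the name `toy` and its values are assumptions, not part of media.csv) whose index is deliberately out of order, to show the difference and to confirm that the original stays unchanged:

```python
import pandas as pd

# Hypothetical toy data; the index labels are deliberately not 0, 1, 2, 3 in order.
toy = pd.DataFrame({"art_id": [86, 87, 88, 89]}, index=[2, 0, 3, 1])

# drop() matches index labels: this removes the rows labelled 0 and 1,
# which are the second and fourth rows here, not the first two.
by_label = toy.drop([0, 1])

# To drop by position instead, translate positions into labels first.
by_position = toy.drop(toy.index[[0, 1]])

print(by_label.index.tolist())     # labels that remain after dropping 0 and 1
print(by_position.index.tolist())  # labels that remain after dropping the first two rows
print(len(toy))                    # the original DataFrame still has all rows
```

With the default index produced by read_csv, labels and positions coincide, which is why `df.drop([0, 1])` removes the first two rows above.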

That works. The other is to use the __inplace__ parameter:

In [None]:
df.drop([0, 1], inplace=True)
df

This works well, but note the following:
<div class="alert alert-block alert-warning">
<b>Tip:</b> If you run this cell repeatedly, e.g. while reworking your code, the DataFrame held in memory has already lost these rows, so the call fails with an error message. Re-run the cells above to reload the data first.
</div>
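
The error in question is typically a KeyError, raised because the labels no longer exist. As a rough sketch on a hypothetical toy DataFrame (`toy` and `second_run` are illustrative names), the `errors="ignore"` option of drop makes the call safe to repeat:

```python
import pandas as pd

toy = pd.DataFrame({"art_id": [86, 87, 88]})  # hypothetical toy data

toy.drop([0, 1], inplace=True)  # first run: labels 0 and 1 exist

try:
    toy.drop([0, 1], inplace=True)  # second run: the labels are already gone
    second_run = "ok"
except KeyError:
    second_run = "KeyError"

# errors="ignore" silently skips labels that are not present.
toy.drop([0, 1], inplace=True, errors="ignore")

print(second_run)
print(toy.index.tolist())
```

Whether silently skipping missing labels is desirable depends on the situation; reloading the data keeps the notebook reproducible from top to bottom.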

Dropping columns works similarly, except that you have to change the orientation of the elements to be dropped. Use the axis parameter as in the following example:

In [None]:
df.drop('Unnamed: 0', axis=1)
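
As an alternative to axis=1, drop also accepts a `columns` keyword, which some find easier to read. A small sketch on a hypothetical toy DataFrame (the values are made up; only the column names mirror the ones used in this notebook):

```python
import pandas as pd

# Hypothetical toy data with the same column names as above.
toy = pd.DataFrame(
    {"Unnamed: 0": [0, 1], "art_id": [86, 87], "art_comment": [9, 3]}
)

by_axis = toy.drop("Unnamed: 0", axis=1)       # orientation given via axis
by_keyword = toy.drop(columns="Unnamed: 0")    # same result via the columns keyword

# A list drops several columns at once.
several = toy.drop(columns=["Unnamed: 0", "art_comment"])

print(by_axis.equals(by_keyword))
print(list(several.columns))
```

As with rows, none of these calls changes `toy` itself unless the result is assigned or inplace=True is used.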

Keeping only the rows that contain a certain value, and thereby dropping all others, can be achieved like this:

In [None]:
df[df.art_comment == 9]
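
What happens here is that the comparison produces a boolean Series, and indexing with it keeps the rows marked True. The sketch below spells this out on a hypothetical toy DataFrame (`toy`, `mask`, `kept` and `others` are illustrative names) and shows how `~` inverts the selection:

```python
import pandas as pd

toy = pd.DataFrame({"art_id": [86, 87, 88], "art_comment": [9, 3, 9]})  # hypothetical

mask = toy.art_comment == 9   # boolean Series, one entry per row
kept = toy[mask]              # rows where art_comment equals 9
others = toy[~mask]           # the complement: rows where it does not

print(mask.tolist())
print(kept.art_id.tolist())
print(others.art_id.tolist())
```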

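The result of such a filter is itself a DataFrame, so it can be stored in a variable and processed further:

In [None]:
res = df[df.art_id == 86]
type(res)

Conversely, we can remove all rows in which a value is missing. A comparison with NaN does not work for this, because NaN is unequal to every value, including itself. Use notna() instead:

In [None]:
df[df.art_content.notna()]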
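
A direct comparison with NaN is not a usable filter: the mask is True for every row, so nothing is removed. The following sketch demonstrates this on a hypothetical toy DataFrame and shows two alternatives, notna() and dropna(). NumPy is imported directly because recent pandas versions no longer provide the `pd.np` alias.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing art_content value.
toy = pd.DataFrame(
    {"art_id": [86, 87, 88], "art_content": ["text", np.nan, "more text"]}
)

# NaN != x is True for every x, so this mask keeps all rows.
naive = toy[np.nan != toy.art_content]

# notna() marks the rows that actually hold a value.
filtered = toy[toy.art_content.notna()]

# dropna() with subset= does the same for the named column(s).
dropped = toy.dropna(subset=["art_content"])

print(len(naive), len(filtered), len(dropped))
```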