### Challenge: Forbes

---

**Problem statement:** You work at a newspaper and your editor wants you to answer the following questions:  

<ol>
    <li> Which is the most valuable company on the Forbes 2000 list? </li>
    <li> Build a table with the top 10 most profitable companies on the list. </li>
    <li> What is the total market value of each of the five most valuable categories? </li>
</ol>

**Dataset code:** Forbes2000

**Hints:**
<ul>
    <li> The .max() method </li>
    <li> The .nlargest() method </li>
    <li> The .groupby() method </li>
</ul>
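
As a quick orientation, the three hinted methods can be tried on a tiny table first. This is only a sketch: the `toy` DataFrame and its column names are invented for illustration and are not part of Forbes2000.

```python
import pandas as pd

# Toy data, invented purely for illustration (not taken from Forbes2000).
toy = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "category": ["x", "y", "x", "y"],
    "value": [10.0, 40.0, 30.0, 20.0],
})

# .max(): the largest single value in a column
highest_value = toy["value"].max()

# .nlargest(): the rows holding the n largest values, already sorted
top_two = toy.nlargest(2, "value")

# .groupby(): split the rows by category, then aggregate each group
total_by_category = toy.groupby("category")["value"].sum()

print(highest_value)
print(top_two["name"].tolist())
print(total_by_category.to_dict())
```

Each of the three questions in the challenge maps onto one of these calls.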

---

**Step 1:** Import the libraries

In [1]:
import pandas as pd
from pydataset import data

**Step 2:** Load the dataset and inspect its structure

In [2]:
# Show the documentation to better understand each column
data('Forbes2000', show_doc=True)

Forbes2000

PyDataset Documentation (adopted from R Documentation. The displayed examples are in R)

##  The Forbes 2000 Ranking of the World's Biggest Companies (Year 2004)

### Description

The Forbes 2000 list is a ranking of the world's biggest companies, measured
by sales, profits, assets and market value.

### Usage

    data("Forbes2000")

### Format

A data frame with 2000 observations on the following 8 variables.

rank

the ranking of the company.

name

the name of the company.

country

a factor giving the country the company is situated in.

category

a factor describing the products the company produces.

sales

the amount of sales of the company in billion USD.

profits

the profit of the company in billion USD.

assets

the assets of the company in billion USD.

marketvalue

the market value of the company in billion USD.

### Source

http://www.forbes.com, assessed on November 26th, 2004.

### Examples

    data("Forbes2000", package = "HSAUR")
    summary(Forbes2000)


In [3]:
# Store the DataFrame in a variable and display its first 10 rows
df = data('Forbes2000')
df.head(10)

| | rank | name | country | category | sales | profits | assets | marketvalue |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | Citigroup | United States | Banking | 94.71 | 17.85 | 1264.03 | 255.3 |
| 2 | 2 | General Electric | United States | Conglomerates | 134.19 | 15.59 | 626.93 | 328.54 |
| 3 | 3 | American Intl Group | United States | Insurance | 76.66 | 6.46 | 647.66 | 194.87 |
| 4 | 4 | ExxonMobil | United States | Oil & gas operations | 222.88 | 20.96 | 166.99 | 277.02 |
| 5 | 5 | BP | United Kingdom | Oil & gas operations | 232.57 | 10.27 | 177.57 | 173.54 |
| 6 | 6 | Bank of America | United States | Banking | 49.01 | 10.81 | 736.45 | 117.55 |
| 7 | 7 | HSBC Group | United Kingdom | Banking | 44.33 | 6.66 | 757.6 | 177.96 |
| 8 | 8 | Toyota Motor | Japan | Consumer durables | 135.82 | 7.99 | 171.71 | 115.4 |
| 9 | 9 | Fannie Mae | United States | Diversified financials | 53.13 | 6.48 | 1019.17 | 76.84 |
| 10 | 10 | Wal-Mart Stores | United States | Retailing | 256.33 | 9.05 | 104.91 | 243.74 |
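
Beyond `head()`, a few one-liners help confirm the structure before answering the questions. The sketch below rebuilds the first three rows shown above by hand so it runs without `pydataset`; the `sample` name is an assumption made for this example, and on the real data the same calls would be made on `df`.

```python
import pandas as pd

# First three rows copied by hand from the head(10) output;
# the real DataFrame has 2000 rows.
sample = pd.DataFrame({
    "rank": [1, 2, 3],
    "name": ["Citigroup", "General Electric", "American Intl Group"],
    "country": ["United States"] * 3,
    "category": ["Banking", "Conglomerates", "Insurance"],
    "sales": [94.71, 134.19, 76.66],
    "profits": [17.85, 15.59, 6.46],
    "assets": [1264.03, 626.93, 647.66],
    "marketvalue": [255.30, 328.54, 194.87],
}, index=[1, 2, 3])

print(sample.shape)    # (number of rows, number of columns)
print(sample.dtypes)   # data type of each column

# Count missing values per column; worth checking before using max() or sum()
missing_per_column = sample.isna().sum()
print(missing_per_column)
```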


**Step 3:** Which is the most valuable company on the Forbes 2000 list?

In [4]:
# Select every row whose 'marketvalue' equals the column maximum (ties included)
empresa_mais_valiosa = df[df['marketvalue'] == df['marketvalue'].max()]
empresa_mais_valiosa

| | rank | name | country | category | sales | profits | assets | marketvalue |
|---|---|---|---|---|---|---|---|---|
| 2 | 2 | General Electric | United States | Conglomerates | 134.19 | 15.59 | 626.93 | 328.54 |
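
An alternative worth knowing is `idxmax()`, which returns the index label of the maximum instead of a filtered DataFrame. The sketch below uses four rows copied by hand from the outputs in this notebook; `sample` is a name assumed for the example. Unlike the boolean mask, `idxmax()` returns only the first label when several rows tie for the maximum.

```python
import pandas as pd

# Four rows copied by hand from the notebook output, keeping the original index labels.
sample = pd.DataFrame({
    "name": ["Citigroup", "General Electric", "ExxonMobil", "Wal-Mart Stores"],
    "marketvalue": [255.30, 328.54, 277.02, 243.74],
}, index=[1, 2, 4, 10])

# Label of the row with the largest market value
top_label = sample["marketvalue"].idxmax()

# .loc with a single label returns that row as a Series
top_row = sample.loc[top_label]
print(top_label, top_row["name"], top_row["marketvalue"])
```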


**Step 4:** Build a table with the name, country, and profit of the top 10 most profitable companies on the list.

In [5]:
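# Take the 10 rows with the largest 'profits'; keep='first' resolves ties by order of appearance
df_mais_lucrativas = df.nlargest(10, columns=['profits'], keep='first')
# Keep only the columns the question asks for
df_mais_lucrativas = df_mais_lucrativas[['name', 'country', 'profits']]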
df_mais_lucrativas

| | name | country | profits |
|---|---|---|---|
| 4 | ExxonMobil | United States | 20.96 |
| 1 | Citigroup | United States | 17.85 |
| 2 | General Electric | United States | 15.59 |
| 6 | Bank of America | United States | 10.81 |
| 5 | BP | United Kingdom | 10.27 |
| 20 | Freddie Mac | United States | 10.09 |
| 22 | Altria Group | United States | 9.2 |
| 10 | Wal-Mart Stores | United States | 9.05 |
| 31 | Microsoft | United States | 8.88 |
| 17 | Total | France | 8.84 |
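
For comparison, the same selection can be written as a descending sort followed by `head()`. The sketch below checks both routes on five rows copied by hand from the notebook output; `sample`, `via_nlargest`, and `via_sort` are names assumed for the example.

```python
import pandas as pd

# Five rows copied by hand from the notebook output, keeping the original index labels.
sample = pd.DataFrame({
    "name": ["Citigroup", "General Electric", "American Intl Group", "ExxonMobil", "BP"],
    "country": ["United States", "United States", "United States",
                "United States", "United Kingdom"],
    "profits": [17.85, 15.59, 6.46, 20.96, 10.27],
}, index=[1, 2, 3, 4, 5])

columns = ["name", "country", "profits"]

# Route 1: ask directly for the 3 largest profits
via_nlargest = sample.nlargest(3, "profits")[columns]

# Route 2: sort everything in descending order, then take the first 3 rows
via_sort = sample.sort_values("profits", ascending=False).head(3)[columns]

print(via_nlargest)
print(via_nlargest.equals(via_sort))
```

`nlargest()` states the intent more directly and avoids sorting the full column when only a few rows are needed.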


**Step 5:** What is the total market value of each of the five most valuable categories?

In [6]:
# Sum the market values within each category (the result is a Series indexed by category)
df_categorias = df.groupby('category')['marketvalue'].sum()
# Keep the five categories with the largest totals
df_categoria_top_5 = df_categorias.nlargest(5)
df_categoria_top_5

category
Banking                        3240.51
Oil & gas operations           1854.36
Drugs & biotechnology          1764.00
Diversified financials         1478.97
Telecommunications services    1464.42
Name: marketvalue, dtype: float64
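
If the per-category average is also of interest, named aggregation can return the total, the mean, and the company count in one pass. This is a sketch on six rows copied by hand from the notebook output, so its numbers differ from the full-dataset result above; `sample`, `summary`, and the output column names are assumptions made for the example.

```python
import pandas as pd

# Six rows copied by hand from the notebook output (not the full dataset).
sample = pd.DataFrame({
    "name": ["Citigroup", "General Electric", "ExxonMobil",
             "BP", "Bank of America", "HSBC Group"],
    "category": ["Banking", "Conglomerates", "Oil & gas operations",
                 "Oil & gas operations", "Banking", "Banking"],
    "marketvalue": [255.30, 328.54, 277.02, 173.54, 117.55, 177.96],
})

# One row per category with three aggregates, ordered by total market value
summary = (
    sample.groupby("category")["marketvalue"]
    .agg(total="sum", average="mean", companies="count")
    .sort_values("total", ascending=False)
)
print(summary.head(5).round(2))
```

A category with many mid-sized companies can lead on the total while ranking lower on the average, so it helps to state which of the two "most valuable" refers to.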