### Challenge: Importing a dataset with pydataset

---

**Problem definition:** Given the dataset code, provide the following information:

<ol>
    <li> Import the dataset using the following pydataset function: data("Code"); </li>
    <li> Print the dataset to the screen; </li>
    <li> Report the data type returned by the data function; </li>
    <li> Report the number of examples (rows) and features (columns) in the dataset; </li>
    <li> Write a function that takes a DataFrame and returns its number of rows and columns. </li>
</ol>

**Dataset code:** plantTraits

**Tips:**
<ul>
    <li> plantTraits is a dataset that describes different plant species by means of biological attributes; </li>
    <li> built-in function (native to the Python language): type() </li>
    <li> DataFrame attribute: shape </li>
</ul>

---

**Step 1:** Import the libraries.

In [1]:
import pandas as pd
from pydataset import data

**Step 2:** Import the dataset.

In [2]:
# Setting the show_doc argument of data() to True displays the dataset's description
data('plantTraits', show_doc=True)

plantTraits

PyDataset Documentation (adopted from R Documentation. The displayed examples are in R)

## Plant Species Traits Data

### Description

This dataset constitutes a description of 136 plant species according to
biological attributes (morphological or reproductive)

### Usage

    data(plantTraits)

### Format

A data frame with 136 observations on the following 31 variables.

`pdias`

Diaspore mass (mg)

`longindex`

Seed bank longevity

`durflow`

Flowering duration

`height`

Plant height, an ordered factor with levels `1` < `2` < ... < `8`.

`begflow`

Time of first flowering, an ordered factor with levels `1` < `2` < `3` < `4` <
`5` < `6` < `7` < `8` < `9`

`mycor`

Mycorrhizas, an ordered factor with levels `0` never < `1` sometimes < `2` always

`vegaer`

aerial vegetative propagation, an ordered factor with levels `0` never < `1`
present but limited < `2` important.

`vegsout`

underground vegetative propagation, an ordered factor with 3 levels identical
to `vegaer` above.


In [3]:
# Store the dataset in a variable and display the first 10 rows
df = data('plantTraits')
df.head(10)

Unnamed: 0,pdias,longindex,durflow,height,begflow,mycor,vegaer,vegsout,autopoll,insects,...,seashiv,seasver,everalw,everparti,elaio,endozoo,epizoo,aquat,windgl,unsp
Aceca,96.84,0.0,2,7,5,2.0,0.0,0.0,0,4,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
Aceps,110.72,0.0,3,8,4,2.0,0.0,0.0,0,4,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
Agrca,0.06,0.666667,3,2,6,2.0,0.0,1.0,0,0,...,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0
Agrst,0.08,0.488889,2,2,7,1.0,2.0,0.0,0,0,...,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0
Ajure,1.48,0.47619,3,2,5,2.0,2.0,0.0,1,3,...,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0
Allpe,2.33,0.5,3,5,4,0.0,0.0,0.0,3,3,...,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0
Anaar,0.38,0.904762,3,2,6,2.0,0.0,0.0,3,2,...,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0
Anene,2.55,0.066667,3,2,3,2.0,0.0,2.0,1,3,...,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0
Angsy,1.48,0.210526,3,3,7,2.0,0.0,0.0,0,3,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
Antod,0.52,0.369565,3,2,4,2.0,0.0,0.0,2,0,...,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0


**Step 3:** Report the data type of the *dataset*.

In [4]:
# The built-in type() function returns the data type of a variable/object
type(df)

pandas.core.frame.DataFrame

In [5]:
# Other examples using the type() function
print('The number 10 is of type %s, while the number 10.0 is of type %s' % (type(10), type(10.0)))

The number 10 is of type <class 'int'>, while the number 10.0 is of type <class 'float'>
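
When the goal is to check a type rather than just print it, `isinstance()` is the usual companion to `type()`. The sketch below is only an illustration: `toy_df` is a hypothetical hand-built DataFrame standing in for the object returned by `data()`, so it runs without pydataset.

```python
import pandas as pd

# Hypothetical toy data standing in for the DataFrame returned by data().
toy_df = pd.DataFrame({"height": [7, 8, 2], "durflow": [2, 3, 3]})

# type() gives the exact class of the object.
exact_type = type(toy_df)

# isinstance() answers a yes/no question and also accepts subclasses.
is_frame = isinstance(toy_df, pd.DataFrame)

# Selecting a single column yields a Series, not a DataFrame.
is_series = isinstance(toy_df["height"], pd.Series)

print(exact_type.__name__, is_frame, is_series)
```

Note that a single column is a `Series`, so a check written for DataFrames would reject it.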


**Step 4:** Report the number of rows and columns of the *dataset*.

In [11]:
df.shape

(136, 31)

In [6]:
# The shape attribute of a DataFrame returns a tuple (number of rows, number of columns)
print('The plantTraits dataset has %d rows and %d columns.' % df.shape)

The plantTraits dataset has 136 rows and 31 columns.


In [7]:
# Another way to access the shape information and display the results
print('The plantTraits dataset has %d rows and %d columns.' % (df.shape[0], df.shape[1]))

The plantTraits dataset has 136 rows and 31 columns.
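
A third option is to unpack the `shape` tuple into named variables, which tends to read better than indexing with `[0]` and `[1]`. This is a minimal sketch on a hypothetical three-row DataFrame (`toy_df`), not on plantTraits itself; the f-string is an alternative to the `%` formatting used in the cells above.

```python
import pandas as pd

# Hypothetical toy data; any DataFrame would work the same way.
toy_df = pd.DataFrame({"pdias": [96.84, 110.72, 0.06], "height": [7, 8, 2]})

# shape is a (rows, columns) tuple, so it can be unpacked directly.
n_rows, n_cols = toy_df.shape

# The same numbers are available through len().
rows_via_len = len(toy_df)
cols_via_len = len(toy_df.columns)

message = f"The DataFrame has {n_rows} rows and {n_cols} columns."
print(message)
```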


In [8]:
# Tip: when a string (text) needs to contain single quotes, a simple option is to
# delimit the string with double quotes, as in the example below:
print("The 'plantTraits' dataset has %d rows and %d columns." % df.shape)

The 'plantTraits' dataset has 136 rows and 31 columns.
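
Delimiting the string with double quotes is the simplest choice, but not the only one: a single quote can also be escaped with a backslash inside a single-quoted string. The sketch below compares the two forms in plain Python; the values 136 and 31 are copied from the output above so that the example does not need the dataset.

```python
# Dimensions copied from the plantTraits output shown earlier.
n_rows, n_cols = 136, 31

# Option 1: double quotes outside, single quotes inside.
double_quoted = "The 'plantTraits' dataset has %d rows and %d columns." % (n_rows, n_cols)

# Option 2: single quotes outside, inner single quotes escaped with a backslash.
escaped = 'The \'plantTraits\' dataset has %d rows and %d columns.' % (n_rows, n_cols)

# Both spellings produce exactly the same string.
same_text = double_quoted == escaped
print(escaped)
```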


**Step 5:** Function that reports the number of rows and columns of a DataFrame

In [15]:
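def infoFrame(df):
    # Print the number of rows and columns of the DataFrame received
    print("The DataFrame has %d rows and %d columns." % df.shape)

# Use a separate variable so the plantTraits DataFrame in df is not overwritten
titanic_df = data('titanic')
infoFrame(titanic_df)

The DataFrame has 1316 rows and 4 columns.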
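
The problem statement asks for a function that returns the dimensions, whereas `infoFrame` only prints them. A possible variant that returns the values is sketched below; the name `frame_dimensions` and the toy DataFrame are hypothetical, chosen only for illustration.

```python
from typing import Tuple

import pandas as pd


def frame_dimensions(df: pd.DataFrame) -> Tuple[int, int]:
    """Return the number of rows and columns of df as a (rows, columns) tuple."""
    n_rows, n_cols = df.shape
    return n_rows, n_cols


# Hypothetical toy data used in place of a pydataset dataset.
toy_df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [5, 6, 7, 8], "c": [0, 0, 1, 1]})

rows, cols = frame_dimensions(toy_df)
print("The DataFrame has %d rows and %d columns." % (rows, cols))
```

Returning the tuple lets the caller decide whether to print, store, or compare the values.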
