## datasets

This module has the functions needed to download several useful datasets that we might be interested in using in our models.

In [None]:
from fastai.gen_doc.nbdoc import *
from fastai.datasets import * 
from fastai.datasets import Config
from pathlib import Path

In [None]:
show_doc(URLs)

<h2 id="URLs" class="doc_header"><code>class</code> <code>URLs</code><a href="https://github.com/fastai/fastai/blob/master/fastai/datasets.py#L8" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#URLs-pytest" style="float:right; padding-right:10px">[test]</a></h2>

> <code>URLs</code>()

<div class="collapse" id="URLs-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#URLs-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>Tests found for <code>URLs</code>:</p><p>Some other tests where <code>URLs</code> is used:</p><ul><li><code>pytest -sv tests/test_datasets.py::test_user_config</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_datasets.py#L43" class="source_link" style="float:right">[source]</a></li></ul><p>To run tests please refer to this <a href="/dev/test.html#quick-guide">guide</a>.</p></div></div>

Global constants for dataset and model URLs.  

This contains the URLs of all the datasets and models, and some classmethods to help use them; you don't create objects of this class. The supported datasets are (with their calling name): `S3_NLP`, `S3_COCO`, `MNIST_SAMPLE`, `MNIST_TINY`, `IMDB_SAMPLE`, `ADULT_SAMPLE`, `ML_SAMPLE`, `PLANET_SAMPLE`, `CIFAR`, `PETS`, `MNIST`. For details on the datasets, see the [fast.ai datasets webpage](http://course.fast.ai/datasets). Datasets with SAMPLE in their name are subsets of the original datasets. In the case of MNIST, we also have a TINY dataset which is even smaller than MNIST_SAMPLE.
Models are currently limited to `WT103`, but you can expect more in the future!

In [None]:
URLs.MNIST_SAMPLE

'http://files.fast.ai/data/examples/mnist_sample'
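As a rough illustration of how a class can act as a namespace of URL constants without ever being instantiated, here is a minimal sketch. `DemoURLs` and its `EXAMPLES` prefix are hypothetical names, not the fastai implementation; only the `mnist_sample` URL is taken from the output above.

```python
class DemoURLs:
    """Hypothetical namespace of dataset URLs; it is never instantiated."""
    # Prefix inferred from the MNIST_SAMPLE value shown above (an assumption).
    EXAMPLES = 'http://files.fast.ai/data/examples/'
    MNIST_SAMPLE = f'{EXAMPLES}mnist_sample'

# Constants are read directly from the class, no object is created.
print(DemoURLs.MNIST_SAMPLE)
```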

## Downloading Data

For the rest of the datasets you will need to download them with [`untar_data`](/datasets.html#untar_data) or [`download_data`](/datasets.html#download_data). [`untar_data`](/datasets.html#untar_data) will download the data file and decompress it, while [`download_data`](/datasets.html#download_data) will only download the compressed file and save it in `.tgz` format.
The locations where the data and models are downloaded are set in `config.yml`, which by default is located in `~/.fastai`. This directory can be changed via the optional environment variable `FASTAI_HOME` (e.g. `FASTAI_HOME=/home/.fastai`).
If no `config.yml` is present in the specified directory, a default one will be created with `data_archive_path`, `data_path` and `models_path` entries. The `data_path` and `models_path` entries point respectively to the [`data`](/tabular.data.html#tabular.data) folder and the [`models`](/tabular.models.html#tabular.models) folder in the same directory as `config.yml`. The `data_archive_path` entry lets you set a separate folder in which to save compressed datasets for archiving purposes. It defaults to the same directory as `data_path`.
Configure those download locations by editing `data_archive_path`, `data_path` and `models_path` in `config.yml`.
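The location rules above can be sketched with the standard library alone. This is only an illustration of the described behavior, not the fastai code: `config_file` and `default_entries` are hypothetical helper names, and the real library writes the entries to a YAML file instead of returning a dict.

```python
import os
from pathlib import Path

def config_file(env=None):
    """Return the assumed location of config.yml, honoring FASTAI_HOME if set."""
    env = os.environ if env is None else env
    home = Path(env.get('FASTAI_HOME', '~/.fastai')).expanduser()
    return home / 'config.yml'

def default_entries(home):
    """Default entries as described above: data and models sit next to config.yml."""
    home = Path(home)
    return {
        'data_archive_path': str(home / 'data'),  # same folder as data_path by default
        'data_path': str(home / 'data'),
        'models_path': str(home / 'models'),
    }
```

Passing an explicit mapping, for example `config_file({'FASTAI_HOME': '/home/.fastai'})`, makes it easy to check the override without touching the real environment.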

In [None]:
show_doc(untar_data)

<h4 id="untar_data" class="doc_header"><code>untar_data</code><a href="https://github.com/fastai/fastai/blob/master/fastai/datasets.py#L221" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#untar_data-pytest" style="float:right; padding-right:10px">[test]</a></h4>

> <code>untar_data</code>(**`url`**:`str`, **`fname`**:`PathOrStr`=***`None`***, **`dest`**:`PathOrStr`=***`None`***, **`data`**=***`True`***, **`force_download`**=***`False`***) → `Path`

<div class="collapse" id="untar_data-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#untar_data-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>Tests found for <code>untar_data</code>:</p><ul><li><code>pytest -sv tests/test_datasets.py::test_load_config</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_datasets.py#L26" class="source_link" style="float:right">[source]</a></li><li><code>pytest -sv tests/test_datasets.py::test_user_config</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_datasets.py#L42" class="source_link" style="float:right">[source]</a></li><li><code>pytest -sv tests/test_vision_data.py::test_trunc_download</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_vision_data.py#L165" class="source_link" style="float:right">[source]</a></li></ul><p>Some other tests where <code>untar_data</code> is used:</p><ul><li><code>pytest -sv tests/test_datasets.py::test_user_config</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_datasets.py#L43" class="source_link" style="float:right">[source]</a></li></ul><p>To run tests please refer to this <a href="/dev/test.html#quick-guide">guide</a>.</p></div></div>

Download `url` to `fname` if `dest` doesn't exist, and un-tgz to folder `dest`.  

In general, [`untar_data`](/datasets.html#untar_data) uses a `url` to download a `tgz` file under `fname`, and then un-tgz `fname` into a folder under `dest`.
If you have run [`untar_data`](/datasets.html#untar_data) before, then running `untar_data(URLs.something)` again will just return `dest` without downloading again.
If you have run [`untar_data`](/datasets.html#untar_data) before, then running [`untar_data`](/datasets.html#untar_data) again with `force_download=True`, or with the tgz file under `fname` corrupted in some way, will remove the existing `fname` and `dest` and start the download again.
If you have run [`untar_data`](/datasets.html#untar_data) before but `dest` does not exist, i.e. no folder under `dest` exists (the folder may have been removed or renamed), then running `untar_data(URLs.something)` again will run [`download_data`](/datasets.html#download_data). In that case, if the tgz file under `fname` exists, there is no actual download and `fname` is simply un-tgzed into `dest`; if `fname` does not exist, the tgz file is actually downloaded.
**Note**: the `url` you feed to [`untar_data`](/datasets.html#untar_data) must be one of `URLs.something`.
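The cases above can be summarized as a small decision function. This is a sketch of the described behavior, not the library's implementation: `plan_untar` and its step labels are hypothetical, and it only returns the list of actions instead of touching the network or the disk.

```python
def plan_untar(dest_exists, archive_exists, force_download=False, archive_corrupted=False):
    """Return the ordered steps a call would take, following the rules described above."""
    steps = []
    # A forced download or a corrupted archive discards whatever is already there.
    if force_download or (archive_exists and archive_corrupted):
        if archive_exists:
            steps.append('remove archive')
            archive_exists = False
        if dest_exists:
            steps.append('remove dest')
            dest_exists = False
    # Without an extracted folder, download only if the archive is missing, then extract.
    if not dest_exists:
        if not archive_exists:
            steps.append('download archive')
        steps.append('extract archive')
    steps.append('return dest')
    return steps
```

For instance, `plan_untar(dest_exists=False, archive_exists=True)` yields only the extraction and return steps, matching the case where the folder was deleted but the tgz file is still present.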

In [None]:
untar_data(URLs.PLANET_SAMPLE)

PosixPath('/home/ubuntu/.fastai/data/planet_sample')

In [None]:
show_doc(download_data)

<h4 id="download_data" class="doc_header"><code>download_data</code><a href="https://github.com/fastai/fastai/blob/master/fastai/datasets.py#L206" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#download_data-pytest" style="float:right; padding-right:10px">[test]</a></h4>

> <code>download_data</code>(**`url`**:`str`, **`fname`**:`PathOrStr`=***`None`***, **`data`**:`bool`=***`True`***, **`ext`**:`str`=***`'.tgz'`***) → `Path`

<div class="collapse" id="download_data-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#download_data-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>Tests found for <code>download_data</code>:</p><ul><li><code>pytest -sv tests/test_datasets.py::test_load_config</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_datasets.py#L26" class="source_link" style="float:right">[source]</a></li><li><code>pytest -sv tests/test_datasets.py::test_user_config</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_datasets.py#L42" class="source_link" style="float:right">[source]</a></li></ul><p>Some other tests where <code>download_data</code> is used:</p><ul><li><code>pytest -sv tests/test_datasets.py::test_user_config</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_datasets.py#L43" class="source_link" style="float:right">[source]</a></li></ul><p>To run tests please refer to this <a href="/dev/test.html#quick-guide">guide</a>.</p></div></div>

Download `url` to destination `fname`.  

Note: If the data file already exists in a <code>data</code> directory alongside the notebook, that data file will be used instead of the one present in the folder specified in `config.yml`. `config.yml` is located in the directory specified by the optional environment variable `FASTAI_HOME` (defaults to `~/.fastai/`). Paths are resolved by calling the function [`datapath4file`](/datasets.html#datapath4file), which checks whether the data exists locally (`data/`) first, before downloading to the folder specified in `config.yml`.
Example:

In [None]:
download_data(URLs.PLANET_SAMPLE)

PosixPath('/home/ubuntu/.fastai/data/planet_sample.tgz')

In [None]:
show_doc(datapath4file)

<h4 id="datapath4file" class="doc_header"><code>datapath4file</code><a href="https://github.com/fastai/fastai/blob/master/fastai/datasets.py#L199" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#datapath4file-pytest" style="float:right; padding-right:10px">[test]</a></h4>

> <code>datapath4file</code>(**`filename`**, **`ext`**:`str`=***`'.tgz'`***, **`archive`**=***`True`***)

<div class="collapse" id="datapath4file-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#datapath4file-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>Tests found for <code>datapath4file</code>:</p><ul><li><code>pytest -sv tests/test_datasets.py::test_load_config</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_datasets.py#L26" class="source_link" style="float:right">[source]</a></li><li><code>pytest -sv tests/test_datasets.py::test_user_config</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_datasets.py#L42" class="source_link" style="float:right">[source]</a></li></ul><p>To run tests please refer to this <a href="/dev/test.html#quick-guide">guide</a>.</p></div></div>

Return data path to `filename`, checking locally first then in the config file.  

All the downloading functions use this to decide where to put the tgz and the expanded folder. If `filename` already exists in a <code>data</code> directory in the same place as the calling notebook/script, that is used as the parent directly; otherwise `config.yml` is read to find the path to use, which defaults to <code>~/.fastai/data</code>. To override this default, simply modify the value in your `config.yml`:

    data_archive_path: ~/.fastai/data
    data_path: ~/.fastai/data

`config.yml` is located in the directory specified by the optional environment variable `FASTAI_HOME` (defaults to `~/.fastai/`).
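The "local first, then config" lookup order can be sketched as follows. `resolve_data_path` and its `local_root` and `config_root` parameters are hypothetical names used for illustration; the real function reads the configured folder from `config.yml` instead of taking it as an argument.

```python
from pathlib import Path

def resolve_data_path(filename, local_root, config_root, ext='.tgz'):
    """Prefer <local_root>/data/<filename> if it (or its archive) exists, else the configured folder."""
    local = Path(local_root) / 'data' / filename
    # Either the expanded folder or the compressed archive counts as "exists locally".
    if local.exists() or local.with_suffix(ext).exists():
        return local
    return Path(config_root) / filename
```

Taking both roots as parameters keeps the sketch free of global state, so the lookup order can be checked against a temporary directory.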

In [None]:
show_doc(url2path)

<h4 id="url2path" class="doc_header"><code>url2path</code><a href="https://github.com/fastai/fastai/blob/master/fastai/datasets.py#L186" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#url2path-pytest" style="float:right; padding-right:10px">[test]</a></h4>

> <code>url2path</code>(**`url`**, **`data`**=***`True`***, **`ext`**:`str`=***`'.tgz'`***)

<div class="collapse" id="url2path-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#url2path-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>Tests found for <code>url2path</code>:</p><ul><li><code>pytest -sv tests/test_datasets.py::test_load_config</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_datasets.py#L26" class="source_link" style="float:right">[source]</a></li><li><code>pytest -sv tests/test_datasets.py::test_user_config</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_datasets.py#L42" class="source_link" style="float:right">[source]</a></li></ul><p>To run tests please refer to this <a href="/dev/test.html#quick-guide">guide</a>.</p></div></div>

Change `url` to a path.  
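Judging from the outputs shown earlier (`planet_sample` and `planet_sample.tgz` under the data folder), the mapping appears to take the last component of the URL and place it under a base directory. A minimal sketch under that assumption, with hypothetical helper names `name_from_url` and `path_from_url`:

```python
from pathlib import PurePosixPath

def name_from_url(url):
    """Return the last path component of the URL, e.g. 'planet_sample'."""
    return url.rstrip('/').split('/')[-1]

def path_from_url(url, base, ext=''):
    """Join the URL's name (plus an optional extension) onto a base directory."""
    return PurePosixPath(base) / f'{name_from_url(url)}{ext}'
```

With `ext='.tgz'` the same helper gives the archive location, and with the default empty extension it gives the expanded folder.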

In [None]:
show_doc(Config)

<h2 id="Config" class="doc_header"><code>class</code> <code>Config</code><a href="https://github.com/fastai/fastai/blob/master/fastai/datasets.py#L129" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#Config-pytest" style="float:right; padding-right:10px">[test]</a></h2>

> <code>Config</code>()

<div class="collapse" id="Config-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#Config-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>Tests found for <code>Config</code>:</p><ul><li><code>pytest -sv tests/test_datasets.py::test_creates_config</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_datasets.py#L15" class="source_link" style="float:right">[source]</a></li><li><code>pytest -sv tests/test_datasets.py::test_default_config</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_datasets.py#L29" class="source_link" style="float:right">[source]</a></li><li><code>pytest -sv tests/test_datasets.py::test_load_config</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_datasets.py#L26" class="source_link" style="float:right">[source]</a></li><li><code>pytest -sv tests/test_datasets.py::test_user_config</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_datasets.py#L42" class="source_link" style="float:right">[source]</a></li></ul><p>Some other tests where <code>Config</code> is used:</p><ul><li><code>pytest -sv tests/test_datasets.py::test_user_config</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_datasets.py#L43" class="source_link" style="float:right">[source]</a></li></ul><p>To run tests please refer to this <a href="/dev/test.html#quick-guide">guide</a>.</p></div></div>

Creates a default config file 'config.yml' in $FASTAI_HOME (default `~/.fastai/`)  

You probably won't need to use this yourself; it is used by [`datapath4file`](/datasets.html#datapath4file).

In [None]:
show_doc(Config.get_path)

<h4 id="Config.get_path" class="doc_header"><code>get_path</code><a href="https://github.com/fastai/fastai/blob/master/fastai/datasets.py#L144" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#Config-get_path-pytest" style="float:right; padding-right:10px">[test]</a></h4>

> <code>get_path</code>(**`path`**)

<div class="collapse" id="Config-get_path-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#Config-get_path-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>No tests found for <code>get_path</code>. To contribute a test please refer to <a href="/dev/test.html">this guide</a> and <a href="https://forums.fast.ai/t/improving-expanding-functional-tests/32929">this discussion</a>.</p></div></div>

Get the `path` in the config file.  

Get the key corresponding to `path` in the [`Config`](/datasets.html#Config).
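Conceptually, this is a key lookup followed by a conversion to a `Path`. The sketch below assumes the configuration is already loaded into a plain dict; `DEFAULTS` and this standalone `get_path` are hypothetical stand-ins, not the class method itself.

```python
from pathlib import Path

# Hypothetical fallback values, mirroring the defaults described in this page.
DEFAULTS = {'data_path': '~/.fastai/data', 'models_path': '~/.fastai/models'}

def get_path(config, key):
    """Look up `key` in `config`, fall back to DEFAULTS, and expand `~` to the home directory."""
    value = config.get(key, DEFAULTS.get(key))
    if value is None:
        raise KeyError(key)
    return Path(value).expanduser()
```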

In [None]:
show_doc(Config.data_path)

<h4 id="Config.data_path" class="doc_header"><code>data_path</code><a href="https://github.com/fastai/fastai/blob/master/fastai/datasets.py#L149" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#Config-data_path-pytest" style="float:right; padding-right:10px">[test]</a></h4>

> <code>data_path</code>()

<div class="collapse" id="Config-data_path-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#Config-data_path-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>No tests found for <code>data_path</code>. To contribute a test please refer to <a href="/dev/test.html">this guide</a> and <a href="https://forums.fast.ai/t/improving-expanding-functional-tests/32929">this discussion</a>.</p></div></div>

Get the path to data in the config file.  

Get the `Path` where the data is stored.

In [None]:
show_doc(Config.model_path)

<h4 id="Config.model_path" class="doc_header"><code>model_path</code><a href="https://github.com/fastai/fastai/blob/master/fastai/datasets.py#L159" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#Config-model_path-pytest" style="float:right; padding-right:10px">[test]</a></h4>

> <code>model_path</code>()

<div class="collapse" id="Config-model_path-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#Config-model_path-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>No tests found for <code>model_path</code>. To contribute a test please refer to <a href="/dev/test.html">this guide</a> and <a href="https://forums.fast.ai/t/improving-expanding-functional-tests/32929">this discussion</a>.</p></div></div>

Get the path to fastai pretrained models in the config file.  

## Undocumented Methods - Methods moved below this line will intentionally be hidden

In [None]:
show_doc(Config.create)

<h4 id="Config.create" class="doc_header"><code>create</code><a href="https://github.com/fastai/fastai/blob/master/fastai/datasets.py#L172" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#Config-create-pytest" style="float:right; padding-right:10px">[test]</a></h4>

> <code>create</code>(**`fpath`**)

<div class="collapse" id="Config-create-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#Config-create-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>No tests found for <code>create</code>. To contribute a test please refer to <a href="/dev/test.html">this guide</a> and <a href="https://forums.fast.ai/t/improving-expanding-functional-tests/32929">this discussion</a>.</p></div></div>

Creates a [`Config`](/datasets.html#Config) from `fpath`.  

In [None]:
show_doc(url2name)

<h4 id="url2name" class="doc_header"><code>url2name</code><a href="https://github.com/fastai/fastai/blob/master/fastai/datasets.py#L183" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#url2name-pytest" style="float:right; padding-right:10px">[test]</a></h4>

> <code>url2name</code>(**`url`**)

<div class="collapse" id="url2name-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#url2name-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>No tests found for <code>url2name</code>. To contribute a test please refer to <a href="/dev/test.html">this guide</a> and <a href="https://forums.fast.ai/t/improving-expanding-functional-tests/32929">this discussion</a>.</p></div></div>

In [None]:
show_doc(Config.get_key)

<h4 id="Config.get_key" class="doc_header"><code>get_key</code><a href="https://github.com/fastai/fastai/blob/master/fastai/datasets.py#L139" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#Config-get_key-pytest" style="float:right; padding-right:10px">[test]</a></h4>

> <code>get_key</code>(**`key`**)

<div class="collapse" id="Config-get_key-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#Config-get_key-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>No tests found for <code>get_key</code>. To contribute a test please refer to <a href="/dev/test.html">this guide</a> and <a href="https://forums.fast.ai/t/improving-expanding-functional-tests/32929">this discussion</a>.</p></div></div>

Get the path to `key` in the config file.  

In [None]:
show_doc(Config.get)

<h4 id="Config.get" class="doc_header"><code>get</code><a href="https://github.com/fastai/fastai/blob/master/fastai/datasets.py#L164" class="source_link" style="float:right">[source]</a><a class="source_link" data-toggle="collapse" data-target="#Config-get-pytest" style="float:right; padding-right:10px">[test]</a></h4>

> <code>get</code>(**`fpath`**=***`None`***, **`create_missing`**=***`True`***)

<div class="collapse" id="Config-get-pytest"><div class="card card-body pytest_card"><a type="button" data-toggle="collapse" data-target="#Config-get-pytest" class="close" aria-label="Close"><span aria-hidden="true">&times;</span></a><p>Tests found for <code>get</code>:</p><p>Some other tests where <code>get</code> is used:</p><ul><li><code>pytest -sv tests/test_datasets.py::test_creates_config</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_datasets.py#L15" class="source_link" style="float:right">[source]</a></li><li><code>pytest -sv tests/test_datasets.py::test_default_config</code> <a href="https://github.com/fastai/fastai/blob/master/tests/test_datasets.py#L29" class="source_link" style="float:right">[source]</a></li></ul><p>To run tests please refer to this <a href="/dev/test.html#quick-guide">guide</a>.</p></div></div>

Retrieve the [`Config`](/datasets.html#Config) in `fpath`.  

## New Methods - Please document or move to the undocumented section