# Loading data from an HTTP server

MLDB gives users full control over where and how data is persisted. MLDB handles multiple protocols for URLs (see [Files and URLs](../../../../doc/#builtin/Url.md.html)). In this tutorial, we provide examples of loading files via:
* <code> file:// </code> for a file inside an MLDB container (see the [Loading Data Tutorial](../../../../ipy/notebooks/_tutorials/_latest/Loading%20Data%20Tutorial.ipynb) for an example)
* <code> http:// </code> or <code> https:// </code> for a file accessible on an HTTP server on the public internet or a private intranet

The notebook cells below use `pymldb`'s `Connection` class to make [REST API](../../../../doc/#builtin/WorkingWithRest.md.html) calls. You can check out the [Using `pymldb` Tutorial](../../../../doc/nblink.html#_tutorials/Using pymldb Tutorial) for more details.

In [2]:
from pymldb import Connection
mldb = Connection()

## Loading data with http:// or https://
MLDB can load publicly available data directly from a web address. The datasets below are from the [Stanford Network Analysis Project (SNAP)](http://snap.stanford.edu/index.html). Here, we have chosen to load [Facebook's Social Circles dataset](http://snap.stanford.edu/data/egonets-Facebook.html). In our first example, we will be [importing a text file](../../../../doc/#builtin/procedures/importtextprocedure.md.html).

In [3]:
dataUrl = "http://snap.stanford.edu/data/facebook_combined.txt.gz"

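# Create the import procedure and run it immediately
print(mldb.put("/v1/procedures/import_data", {
    "type": "import.text",
    "params": {
        "dataFileUrl": dataUrl,          # gzipped text file fetched over HTTP
        "headers": ["node", "edge"],     # column names for the two fields
        "delimiter": " ",
        "quotechar": "",
        "outputDataset": "import_URL1",
        "runOnCreation": True
    }
}))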

<Response [201]>
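
The payload passed to `mldb.put` above is a plain Python dictionary, so it can be built by a small function when several files need importing. The following is a minimal sketch: `make_import_config` is a hypothetical helper (it is not part of `pymldb`), and it only reuses the parameter names shown in the cell above.

```python
def make_import_config(data_url, headers, output_dataset, delimiter=" "):
    """Build an import.text payload (hypothetical helper, not part of pymldb)."""
    return {
        "type": "import.text",
        "params": {
            "dataFileUrl": data_url,
            "headers": list(headers),
            "delimiter": delimiter,
            "quotechar": "",
            "outputDataset": output_dataset,
            "runOnCreation": True,
        },
    }


config = make_import_config(
    "http://snap.stanford.edu/data/facebook_combined.txt.gz",
    ["node", "edge"],
    "import_URL1",
)

# With a running MLDB instance, the payload would be sent as in the cell above:
# mldb.put("/v1/procedures/import_data", config)
```

Keeping the payload separate from the REST call makes it easy to inspect or reuse before anything is sent to the server.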


If the targeted file is inside an archive, we can specify the path to the file within it: prefix the URL with `archive+` and append `#` followed by the path inside the archive, as seen in the example below. Here, we load the `3980.circles` file from the `facebook` folder.

In [8]:
dataUrl = "http://snap.stanford.edu/data/facebook.tar.gz"

# Import a single file from inside the tar.gz archive
print(mldb.put("/v1/procedures/import_data", {
    "type": "import.text",
    "params": {
        # archive+<archive URL>#<path of the file inside the archive>
        "dataFileUrl": "archive+" + dataUrl + "#facebook/3980.circles",
        "headers": ["circles"],
        "delimiter": " ",
        "quotechar": "",
        "outputDataset": "import_URL2",
        "runOnCreation": True
    }
}))

<Response [201]>
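
The `dataFileUrl` used above follows the pattern `archive+<archive URL>#<path inside the archive>`. As a small illustration, the string can be composed by a function; `archive_url` is a hypothetical helper name introduced here for clarity and is not an MLDB or `pymldb` function.

```python
def archive_url(archive, member_path):
    """Compose an 'archive+<url>#<path>' string (hypothetical helper)."""
    # Drop any leading slash so the path stays relative to the archive root.
    return "archive+" + archive + "#" + member_path.lstrip("/")


url = archive_url(
    "http://snap.stanford.edu/data/facebook.tar.gz",
    "facebook/3980.circles",
)
print(url)
# archive+http://snap.stanford.edu/data/facebook.tar.gz#facebook/3980.circles
```

The resulting string is the same value passed as `dataFileUrl` in the cell above.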


Note: you will notice that the contents of each row are grouped in a single column. [Using JavaScript inside MLDB's Query Language statements with the built-in jseval function](../../../../ipy/notebooks/_tutorials/_latest/Using%20the%20jseval%20function%20inside%20an%20SQL%20statement.ipynb), we can structure the data in a nicer way.

With support for multiple protocols, MLDB provides powerful features for extracting data that resides on the public web. As seen in this tutorial, you can even pinpoint the exact file to load within an archive's folder structure, allowing flexible data management.