# `Loader` Tutorial
   
Python has numerous ways to open files on your computer or download them from the internet. The Lexos `Loader` is a "helper" that invisibly takes care of many of the gotchas (like non-standard character encodings) so that you can get on with your work.

The `Loader` is in active development and exists in several versions. The most advanced is currently the "smart" version, which is the one referenced in the `import` statement below.

You use the `Loader` by first instantiating a `Loader` object and then calling its `load()` method. Because loading is a separate step, you can call `load()` more than once to keep adding texts to the same loader.

## Import the `Loader` Module

In [2]:
from lexos.io.smart import Loader

## Load a Local File or a List of Local Files

Notice in the list below that you can load `.txt`, `.docx`, or `.pdf` formats.

When files are loaded into a `Loader`, their character encoding is automatically converted into UTF-8 format.

In [None]:
# A single file
data = "../test_data/txt/Austen_Pride.txt"

loader1 = Loader()
loader1.load(data)

# A list of files
data = ["../test_data/txt/Austen_Pride.txt",
        "../test_data/docx/Austen_Sense_sm.docx",
        "../test_data/pdf/Austen_Pride_sm.pdf"]

loader2 = Loader()
loader2.load(data)
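
To give a sense of what converting to UTF-8 involves, here is a minimal sketch using only the Python standard library. It is not the Lexos implementation: the `decode_to_str` helper and its list of fallback encodings are assumptions made for illustration.

```python
def decode_to_str(raw, fallbacks=("utf-8", "cp1252", "latin-1")):
    """Decode bytes by trying each candidate encoding in order (hypothetical helper)."""
    for encoding in fallbacks:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Reached only if every candidate failed; substitute undecodable bytes
    return raw.decode("utf-8", errors="replace")


# Bytes written by a Windows-1252 editor are not valid UTF-8
sample = "Café".encode("cp1252")
text = decode_to_str(sample)

# Once decoded to a Python string, the text can be written back out as UTF-8
utf8_bytes = text.encode("utf-8")
print(text)
```

Trying encodings in a fixed order is a simple heuristic and can guess wrong on ambiguous input, which is one reason to let a loader handle this step for you.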


## Accessing Texts in a `Loader`

Texts are accessed with `loader.texts`. This is a list, so, if you wish to access a single text, you must do so by its index in the list (e.g. `loader.texts[0]`).

However, if you are using a loop, you can use `for text in loader` in addition to `for text in loader.texts`.

In the examples below, we print only the first 100 characters.

In [None]:
text = loader1.texts[0]
print(text[0:100])

print("===============================")

for text in loader2:
    print(text[0:100])
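
If you are curious why both loop forms behave the same way, the sketch below shows one way a class can support them. `MiniLoader` is a hypothetical stand-in written for this illustration, not the Lexos `Loader`, and its `load()` accepts raw strings rather than file paths.

```python
class MiniLoader:
    """Hypothetical stand-in for illustration; not the Lexos Loader."""

    def __init__(self):
        self.texts = []

    def load(self, new_texts):
        # Accept one string or a list of strings and append to what is already stored
        if isinstance(new_texts, str):
            new_texts = [new_texts]
        self.texts.extend(new_texts)

    def __iter__(self):
        # Delegating to the list is what makes `for text in mini` work
        return iter(self.texts)


mini = MiniLoader()
mini.load("It is a truth universally acknowledged")
mini.load(["The family of Dashwood", "Emma Woodhouse, handsome, clever, and rich"])

# Both loop forms visit the same texts in the same order
previews_direct = [text[0:10] for text in mini]
previews_via_list = [text[0:10] for text in mini.texts]
print(previews_direct == previews_via_list)
```

The second call to `load()` adds to the texts already stored, which mirrors the way texts can be added to a loader multiple times.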

## Loading Local Directories or Zip Files

Directories or zip files containing files with `.txt`, `.docx`, and `.pdf` extensions can be loaded using the same technique.

In [None]:
# Get all the files in the docx directory
loader1 = Loader()
loader1.load("../test_data/docx")

# Get all the files in a zip file
# (start with a fresh loader so it holds only the zip file's contents)
loader2 = Loader()
loader2.load("../test_data/zip/txt.zip")

# Print the first 100 characters of the first file in the directory
print(loader1.texts[0][0:100])

# Print the first 100 characters of the first file in the zip file
print(loader2.texts[0][0:100])
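
For comparison, here is roughly what reading the plain-text members of a zip archive looks like by hand with the standard library. This is a sketch under stated assumptions, not how the `Loader` works internally: the `read_txt_from_zip` helper is hypothetical, it assumes the members are already UTF-8, and it ignores `.docx` and `.pdf` files.

```python
import tempfile
import zipfile
from pathlib import Path


def read_txt_from_zip(zip_path):
    """Return the decoded contents of every .txt member, in sorted name order."""
    texts = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in sorted(archive.namelist()):
            if name.lower().endswith(".txt"):
                texts.append(archive.read(name).decode("utf-8"))
    return texts


# Build a small throwaway archive so the example runs anywhere
with tempfile.TemporaryDirectory() as tmp:
    zip_path = Path(tmp) / "sample.zip"
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("a.txt", "First text")
        archive.writestr("b.txt", "Second text")
        archive.writestr("notes.md", "Not a plain-text file")
    zip_texts = read_txt_from_zip(zip_path)

print(zip_texts)
```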

## Load Texts from a URL

Use the same technique to download a text from a URL or several texts from a list of URLs.

In [None]:
loader = Loader()
loader.load("https://www.gutenberg.org/files/84/84-0.txt")

print(loader.texts[0][0:1000])