<a href="https://colab.research.google.com/github/UCD-Physics/Python-HowTos/blob/main/Importing_Data_Numpy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Importing data from text files with Numpy

Being able to import data easily is essential for analysing and graphing it.

There are several ways to do this but in this section we will go over the basic way to do it using Numpy's `loadtxt()`.

For more information see the [Numpy loadtxt documentation](https://numpy.org/doc/stable/reference/generated/numpy.loadtxt.html).

In [1]:
import numpy as np

## Download sample data files

Download the sample text files from GitHub by running the cell below. Note: you need `wget` installed. It is already installed in Google Colab but may not be on your own machine; in that case you can either install `wget` or enter the URLs of the files directly in a browser and save them.

In [None]:
!wget https://raw.githubusercontent.com/UCD-Physics/Python-HowTos/main/sample_data_1.txt
!wget https://raw.githubusercontent.com/UCD-Physics/Python-HowTos/main/sample_data_2.txt


In the following sections we assume the data has been downloaded and is in the same folder as this notebook.

## Numpy loadtxt()

Note: Numpy `loadtxt()` will skip comment lines (starting with a '#') by default.

The simplest usage is 
```python
data = np.loadtxt(filename)
```
where the data gets loaded into the variable called `data`.
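
As a minimal sketch of the default comment handling noted above, the example below reads from an in-memory `io.StringIO` object instead of a file on disk. The contents are made up for illustration and are not one of the sample files.

```python
import io
import numpy as np

# In-memory stand-in for a text file (hypothetical contents)
text = io.StringIO("# time (s)\n0.5\n1.0\n# a comment between rows\n1.5\n")

# Lines starting with '#' are skipped by default
times = np.loadtxt(text)
print(times)
```

Only the three numerical lines end up in the array; both comment lines are ignored.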

## Loading data with a single column

To load data with a single column, pass the filename to `loadtxt()`; the result is a 1D array:

In [2]:
#example data

data = np.loadtxt("sample_data_1.txt")
data

array([2.30770842, 2.30797343, 2.30820785, 2.30822824, 2.30824862,
       2.30822824, 2.30817727, 2.30859516, 2.30847285, 2.30875824,
       2.30882959, 2.30878882, 2.30891113, 2.27945513, 2.23846134,
       2.23949078, 2.24038771, 2.2413356 , 2.24198791, 2.24296638,
       2.24407735, 2.24496409, 2.24568775, 2.24687006, 2.24742045,
       2.24821546, 2.24935701, 2.25039663, 2.25076356, 2.25179299,
       2.25236376, 2.25304665, 2.25357665, 2.25420858, 2.25406589])

## Loading data with more than one column

There are two ways to load data with more than one column:
1. Load it as above into a single variable (it will be loaded as a 2D array) and then extract the columns
2. Specify a variable for every column and use the `unpack=True` option (preferred!)

Both are illustrated below:

### Load as single 2D array and extract columns

In [3]:
# sample_data_2 has two columns, say voltage and current

data = np.loadtxt('sample_data_2.txt')  
voltage = data[:,0]
current = data[:,1]

print("Voltage: ",voltage)
print("Current: ",current)

Voltage:  [1. 2. 3. 4. 5. 6. 7. 8. 9.]
Current:  [10. 20. 30. 40. 50. 60. 70. 80. 90.]


### Load into individual arrays

In [4]:
voltage, current = np.loadtxt('sample_data_2.txt', unpack=True)

print("Voltage: ",voltage)
print("Current: ",current)

Voltage:  [1. 2. 3. 4. 5. 6. 7. 8. 9.]
Current:  [10. 20. 30. 40. 50. 60. 70. 80. 90.]



## More advanced options

`loadtxt()` also accepts many optional arguments. If the file you are importing has a header line, you can skip it and read only the numerical data with the option
```python
skiprows = 1
```
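
A short sketch of `skiprows` in use, assuming a hypothetical two-column file whose first line is a plain-text header (no leading `#`). The data is held in an `io.StringIO` object so the example runs without any file on disk.

```python
import io
import numpy as np

# Hypothetical file contents: one header line, then two columns of numbers
text = io.StringIO("voltage current\n1 10\n2 20\n3 30\n")

# skiprows=1 ignores the header line; unpack=True gives one array per column
v, i = np.loadtxt(text, skiprows=1, unpack=True)

print("Voltage: ", v)
print("Current: ", i)
```

Without `skiprows=1` this call would fail, because the header text cannot be converted to a number.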

---

If your data is separated by commas, you can use the `delimiter` option to split each line into values
```python
delimiter =','
```
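
For instance, comma-separated data could be read as sketched below. The contents are invented for illustration and stand in for a CSV file on disk.

```python
import io
import numpy as np

# Hypothetical comma-separated contents
text = io.StringIO("1,10\n2,20\n3,30\n")

# Split each line on commas instead of the default whitespace
table = np.loadtxt(text, delimiter=',')

print(table.shape)   # rows, columns
print(table[:, 1])   # second column
```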

---

It can also be useful to specify the type of data being read in each column, e.g. float, integer, string
```python
dtype = 'float, int, S24'
```
*Note: `S24` means a byte string of at most 24 characters.*
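
A minimal sketch of what this option returns, using made-up data with one float, one integer and one text column. With a comma-separated `dtype` string like this, Numpy builds a structured array whose fields get the default names `f0`, `f1` and `f2`.

```python
import io
import numpy as np

# Hypothetical contents: float value, integer count, text label
text = io.StringIO("1.5 3 resistor\n2.5 4 capacitor\n")

# One type per column; the result is a structured array with one record per row
records = np.loadtxt(text, dtype='float, int, S24')

print(records['f0'])  # float column
print(records['f1'])  # integer column
print(records['f2'])  # byte-string column
```

Each column is accessed by its field name rather than by a numerical index.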

---

To read only certain columns, you can use the `usecols` option, remembering that column numbering begins at 0
```python
usecols = (0,2,4) 
```
to read the first, third and fifth columns.
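
As an illustration, the sketch below applies `usecols` to a hypothetical five-column data set held in memory and combines it with `unpack=True`, so that each selected column gets its own variable. The variable names are arbitrary.

```python
import io
import numpy as np

# Hypothetical contents with five whitespace-separated columns
text = io.StringIO("1 2 3 4 5\n6 7 8 9 10\n")

# Keep only columns 0, 2 and 4 and unpack them into separate arrays
first, third, fifth = np.loadtxt(text, usecols=(0, 2, 4), unpack=True)

print(first)
print(third)
print(fifth)
```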

