<a href="https://colab.research.google.com/github/SCS-Technology-and-Innovation/IntroComp/blob/main/storage.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Data storage

When data is stored on any computer system, it has to reside in some type of memory hardware. Persistent storage such as hard drives, solid-state drives, and USB drives holds data even when the computer is turned off, but it is too slow to provide a smooth experience while the computer is actually processing something. Hence the computer also has a second type of memory, [RAM](https://en.wikipedia.org/wiki/Random-access_memory) (for example, DDR SDRAM), in which the processing itself takes place. This second type of memory is usually smaller in capacity than the storage media but much faster, and it is typically volatile: its contents are lost when the power is turned off.

When a computer carries out tasks, the processor (CPU) accesses information in the RAM. If some input needs to be read, it is read from storage into RAM, and any results that need to be kept are written back to storage.
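
The read, process, and write-back cycle described above can be sketched in a few lines. This is only a minimal illustration, assuming a throwaway file in a temporary directory; the file name `counter.txt` is made up for the example.

```python
import os
import tempfile

# Work inside a temporary directory so nothing permanent is touched.
with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, 'counter.txt')  # hypothetical file name

    # Put some data on storage first.
    with open(path, 'w') as target:
        target.write('41')

    # Read: the contents are copied from storage into a variable in RAM.
    with open(path) as source:
        value = int(source.read())

    # Process: this step happens entirely in memory.
    value += 1

    # Write back: the updated value is stored again.
    with open(path, 'w') as target:
        target.write(str(value))

    # Read it once more to confirm what ended up on storage.
    with open(path) as source:
        stored = source.read()

print(stored)
```

The variable `value` lives only in RAM while the program runs; the file is what would survive a restart (if it were not in a temporary directory).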

In addition to *local* storage, one can have remote or cloud storage that cannot be accessed without using a *network interface* such as Bluetooth or Ethernet.

## File systems

When information is stored on a drive, it is organized into units such as partitions (divisions of a single physical device into two or more smaller "virtual" devices), which are further organized into a recursive structure of folders that may contain files. Additionally, one can create **links** to make one file or folder appear as if it were also elsewhere within the hierarchy.
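
To make the idea of a link concrete, here is a small sketch using a *symbolic* link, which is one kind of link among several. It assumes a POSIX-like system (on other systems creating links may require extra permissions), and the names `original.txt` and `alias.txt` are made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    original = os.path.join(workdir, 'original.txt')  # hypothetical names
    alias = os.path.join(workdir, 'alias.txt')

    with open(original, 'w') as target:
        target.write('same content')

    # Create a symbolic link: alias.txt now points at original.txt.
    os.symlink(original, alias)

    is_link = os.path.islink(alias)

    # Reading through the link gives the content of the original file.
    with open(alias) as source:
        via_link = source.read()

print(is_link, via_link)
```

The content exists only once on the drive; the link is just a second name that leads to it.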

The local file system can be accessed in Python with the `os` module, which interfaces with the operating system to gain such access.

In [1]:
import os

print(os.name) # let's see what operating system we are on

posix


When it says `posix`, it means a Unix-like operating system, such as a flavor of [Linux](https://en.wikipedia.org/wiki/Linux) (macOS reports `posix` as well).

We can also check further details, including the default *encoding* used for file names, using the `sys` module.

In [2]:
import sys

print(sys.platform)
print(sys.getfilesystemencoding())

linux
utf-8


The start of a file system is called the **root** directory. In Linux-like systems it is called just `/`.

In [3]:
root = '/'

if os.getcwd() != root:
  print('Jumping to root')
  os.chdir(root) # jump to root if we are not already there
else:
  print('Already at root')

Jumping to root


We can get a **directory listing** to see where we could navigate to:

In [4]:
for thing in os.listdir():
  print(thing)

proc
home
sys
lib
mnt
sbin
lib32
tmp
boot
etc
srv
run
var
opt
dev
media
root
usr
lib64
bin
libx32
kaggle
.dockerenv
datalab
tools
content
NGC-DL-CONTAINER-LICENSE
cuda-keyring_1.0-1_all.deb
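
A plain listing does not tell us which entries are folders and which are files. One possible way to tell them apart is sketched below with `os.scandir`; the helper `classify` and the example names are made up for illustration, and the demo runs in a temporary directory so the result does not depend on the machine.

```python
import os
import tempfile

def classify(path):
    """Return (directories, files) found directly inside path, sorted by name."""
    directories, files = [], []
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir():
                directories.append(entry.name)
            else:
                files.append(entry.name)
    return sorted(directories), sorted(files)

with tempfile.TemporaryDirectory() as workdir:
    # Build a tiny example: one folder and one file.
    os.mkdir(os.path.join(workdir, 'folder'))
    with open(os.path.join(workdir, 'note.txt'), 'w') as target:
        target.write('hi')

    directories, files = classify(workdir)

print(directories, files)
```

Calling `classify('/')` would split the root listing above in the same way.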


Cool beans. Let's go `home` (that is where our usual files are supposed to reside).

In [5]:
os.chdir('home') # change current working directory

Check where we are...

In [6]:
os.getcwd() # get current working directory

'/home'

What's there?

In [7]:
for thing in os.listdir():
  print(thing)

Nothing. Colab is special and the files we see on the file browser are inside `/content/` instead of `/home/`.

In [8]:
os.chdir('..') # this means "back one level"
print(os.getcwd())
os.chdir('content')
for thing in os.listdir():
  print(thing)

/
.config
sample_data


This is precisely what we see. We do not see `.config` in the sidebar, since files and folders whose names start with a dot are *hidden* by default.
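
If we want our own listing to behave like the sidebar, we can filter out the dot names ourselves. This is a minimal sketch; the helper `visible` is made up for the example, and the folders are recreated in a temporary directory so the code runs anywhere.

```python
import os
import tempfile

def visible(names):
    """Keep only the names that do not start with a dot."""
    return [name for name in names if not name.startswith('.')]

with tempfile.TemporaryDirectory() as workdir:
    # Mimic the two entries seen in /content.
    os.mkdir(os.path.join(workdir, '.config'))
    os.mkdir(os.path.join(workdir, 'sample_data'))

    everything = sorted(os.listdir(workdir))
    shown = visible(everything)

print(everything)
print(shown)
```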

Let's create a whole new directory in there. Note that if you run this code a second time, you will see an error message (a `FileExistsError`) because the directory will already exist.

The sidebar may take a while to sync, but the creation itself is instantaneous.

In [9]:
os.mkdir('cooltest')
print(os.listdir())

['.config', 'cooltest', 'sample_data']
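
If the error on a second run is a nuisance, one option is `os.makedirs` with `exist_ok=True`, which quietly does nothing when the directory is already there. The sketch below tries both approaches in a temporary directory, so it can be rerun safely.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    target_dir = os.path.join(workdir, 'cooltest')

    os.makedirs(target_dir, exist_ok=True)  # creates the directory
    os.makedirs(target_dir, exist_ok=True)  # already there: no error

    try:
        os.mkdir(target_dir)  # plain mkdir complains if it already exists
        raised = False
    except FileExistsError:
        raised = True

    created = os.path.isdir(target_dir)

print(created, raised)
```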


Let's create a new file in there.

In [12]:
with open('cooltest/newfile.txt', 'w') as target: # w means write mode
  print('hello', file = target)

Can we see it in there?

In [14]:
print(os.listdir('cooltest/'))

['newfile.txt']
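
Seeing the name in the listing does not show what is inside the file. To check the content, we can open the file again in read mode. The sketch below repeats the write step in a temporary directory so that it runs on its own.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, 'newfile.txt')

    with open(path, 'w') as target:  # w means write mode
        print('hello', file=target)

    with open(path) as source:  # the default mode is 'r', read mode
        contents = source.read()

print(repr(contents))
```

Note the `\n` at the end: `print` adds a line break after the text it writes.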


We can rename files and folders.

In [15]:
os.rename('cooltest', 'newname')
os.chdir('newname')
os.rename('newfile.txt', 'anothername.txt')
print(os.listdir())

['anothername.txt']


We can also *remove* files with `remove` and directories with `rmdir`, as long as the directory is **empty**.

In [16]:
os.remove('anothername.txt')
os.chdir('..')
os.rmdir('newname')
print(os.listdir())

['.config', 'newfile.txt', 'sample_data']
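
What if the directory is not empty? Then `rmdir` refuses. One option from the standard library is `shutil.rmtree`, which deletes a folder together with everything inside it, so it should be used with care. A minimal sketch in a temporary directory:

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    folder = os.path.join(workdir, 'newname')
    os.mkdir(folder)
    with open(os.path.join(folder, 'anothername.txt'), 'w') as target:
        target.write('hello')

    try:
        os.rmdir(folder)  # fails because the folder still contains a file
        refused = False
    except OSError:
        refused = True

    shutil.rmtree(folder)  # removes the folder and its contents
    gone = not os.path.exists(folder)

print(refused, gone)
```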
