# PySpark part 6

## StorageLevel

A `StorageLevel` controls how an RDD is stored once it is persisted: in memory, on disk, or both. It also controls whether the RDD is kept in serialized form and whether its partitions are replicated.

The following code block shows the constructor signature of `StorageLevel`:

```python
class pyspark.StorageLevel(useDisk, useMemory, useOffHeap, deserialized, replication=1)
```

### Terminology

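PySpark predefines the following storage levels, each built from the five constructor arguments shown above. The `_SER` variants carry the same flags as their plain counterparts because PySpark always stores Python objects in serialized (pickled) form.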

- `DISK_ONLY` = StorageLevel(True, False, False, False, 1)
- `DISK_ONLY_2` = StorageLevel(True, False, False, False, 2)
- `MEMORY_AND_DISK` = StorageLevel(True, True, False, False, 1)
- `MEMORY_AND_DISK_2` = StorageLevel(True, True, False, False, 2)
- `MEMORY_AND_DISK_SER` = StorageLevel(True, True, False, False, 1)
- `MEMORY_AND_DISK_SER_2` = StorageLevel(True, True, False, False, 2)
- `MEMORY_ONLY` = StorageLevel(False, True, False, False, 1)
- `MEMORY_ONLY_2` = StorageLevel(False, True, False, False, 2)
- `MEMORY_ONLY_SER` = StorageLevel(False, True, False, False, 1)
- `MEMORY_ONLY_SER_2` = StorageLevel(False, True, False, False, 2)
- `OFF_HEAP` = StorageLevel(True, True, True, False, 1)
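To make the flag tuples in the list easier to read, here is a small pure-Python sketch that turns them into a description. It is not PySpark's implementation: `Level` and `describe` are hypothetical stand-ins, and the wording of the description is assumed to follow the format of the output shown in the example in this section.

```python
from typing import NamedTuple


class Level(NamedTuple):
    """Hypothetical stand-in for the five StorageLevel constructor flags."""
    use_disk: bool
    use_memory: bool
    use_off_heap: bool
    deserialized: bool
    replication: int = 1


def describe(level: Level) -> str:
    """Build a readable summary of which flags are set."""
    parts = []
    if level.use_disk:
        parts.append("Disk")
    if level.use_memory:
        parts.append("Memory")
    if level.use_off_heap:
        parts.append("OffHeap")
    parts.append("Deserialized" if level.deserialized else "Serialized")
    parts.append(f"{level.replication}x Replicated")
    return " ".join(parts)


# Flag tuples copied from the list above.
levels = {
    "DISK_ONLY": Level(True, False, False, False, 1),
    "MEMORY_ONLY_2": Level(False, True, False, False, 2),
    "MEMORY_AND_DISK_2": Level(True, True, False, False, 2),
    "OFF_HEAP": Level(True, True, True, False, 1),
}

for name, level in levels.items():
    print(f"{name}: {describe(level)}")
```

Reading the list this way shows that the `_2` suffix only changes the replication count, while the first three flags choose where the data lives.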

### Example

The following example persists an RDD with the storage level `MEMORY_AND_DISK_2`. This level keeps partitions in memory, spills them to disk when they do not fit, and stores two replicas of each partition.



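```python
from pyspark import SparkContext, StorageLevel

# Run Spark locally with a single worker thread.
sc = SparkContext('local', 'StorageLevel App')
rdd = sc.parallelize([1, 2])

# Keep partitions in memory and on disk, with two replicas of each.
rdd.persist(StorageLevel.MEMORY_AND_DISK_2)
print(rdd.getStorageLevel())
```

The output is:

```
Disk Memory Serialized 2x Replicated
```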
