# Indexing and slicing

## Indexing

The elements of an ordered sequence can be accessed by position, using an index. A string is one example of an ordered sequence, and **Python** uses `[]` to index into it.

In [1]:
s = "hello world"
s[0]

'h'

Indexing in **Python** starts from `0`, so index `0` corresponds to the first element of the sequence. To get the fifth element, use index `4`.

In [2]:
s[4]

'o'

In addition to non-negative indices, **Python** also supports negative indices, which count from the end of the sequence toward the front. For example, index `-2` refers to the second element from the end:

In [3]:
s[-2]

'l'

When a single index is greater than or equal to the length of the string, Python raises an `IndexError`:

In [4]:
s[11]

IndexError: string index out of range
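
As a minimal sketch of the rules above, the following checks that a negative index `-k` refers to the same element as `len(s) - k`, and that any index outside the range `-len(s)` to `len(s) - 1` raises `IndexError`. The helper `safe_get` is a hypothetical name introduced only for this illustration; it is not part of Python.

```python
s = "hello world"
n = len(s)  # 11

# Valid indices run from -n to n - 1; a negative index -k refers to position n - k.
assert s[-2] == s[n - 2] == "l"
assert s[-n] == s[0] == "h"


def safe_get(seq, i, default=None):
    """Hypothetical helper: return seq[i], or default if i is out of range."""
    try:
        return seq[i]
    except IndexError:
        return default


print(safe_get(s, 10))   # d
print(safe_get(s, 11))   # None
print(safe_get(s, -12))  # None
```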

## Slicing

Slicing extracts a subsequence from a sequence. Its syntax is:

     var[lower:upper:step]

The range includes `lower` but excludes `upper`, that is, `[lower, upper)`. `step` is the interval between the selected elements; if it is omitted, it defaults to `1`.

In [5]:
s

'hello world'

In [6]:
s[1:3]

'el'

The slice contains `3 - 1 = 2` elements.

Negative indices can also be used to specify the bounds of a slice:

In [7]:
s[1:-2]

'ello wor'

This includes index `1` but excludes index `-2`.
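
To make the handling of a negative bound concrete, here is a small sketch. It uses the built-in `slice` object and its `indices` method, which resolves the bounds for a sequence of a given length; the variable names are chosen only for illustration.

```python
s = "hello world"

# A negative bound is interpreted relative to the end of the sequence.
assert s[1:-2] == s[1:len(s) - 2] == "ello wor"

# slice.indices(length) returns the resolved (start, stop, step) triple.
start, stop, step = slice(1, -2).indices(len(s))
print(start, stop, step)           # 1 9 1
print(len(s[1:-2]), stop - start)  # 8 8
```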

Both `lower` and `upper` can be omitted. Omitting `lower` means slicing from the beginning, and omitting `upper` means slicing all the way to the end.

In [8]:
s[:3]

'hel'

In [9]:
s[-3:]

'rld'

In [10]:
s[:]

'hello world'

Take every second character, starting from the first:

In [11]:
s[::2]

'hlowrd'

When `step` is negative, omitting `lower` means starting from the end, and omitting `upper` means continuing all the way to the beginning.

In [12]:
s[::-1]

'dlrow olleh'
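
A few more cases with a negative `step` may help; this is only an illustrative sketch. The slice walks backwards from `lower` toward `upper`, and `upper` is still excluded.

```python
s = "hello world"

print(s[8:2:-2])  # indices 8, 6, 4 -> rwo
print(s[:5:-1])   # from the end down to, but not including, index 5 -> dlrow
print(s[4::-1])   # from index 4 back to the beginning -> olleh

# Reversing is equivalent to visiting the indices n-1, n-2, ..., 0.
reversed_manually = "".join(s[i] for i in range(len(s) - 1, -1, -1))
assert reversed_manually == s[::-1]
```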

When the given `upper` exceeds the length of the string, Python does not raise an error; the slice simply stops at the end. (Because `upper` is excluded, it may legitimately equal the length.)

In [13]:
s[:100]

'hello world'
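
The contrast with single-element indexing can be sketched as follows: out-of-range slice bounds are clamped to the sequence, while an out-of-range index raises an error.

```python
s = "hello world"

# Out-of-range slice bounds are clamped instead of raising.
print(repr(s[:100]))    # 'hello world'
print(repr(s[5:100]))   # ' world'
print(repr(s[50:100]))  # '' (empty string, no error)

# A single index is not clamped.
try:
    s[100]
except IndexError as exc:
    print("IndexError:", exc)  # IndexError: string index out of range
```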

## Why indexing starts at 0

### Reasons for using the form `[low, up)`

Suppose you need to represent the inner substring `el` in the string `hello`:

|Method|`[low, up)`|`(low, up]`|`(low, up)`|`[low, up]`|
|--|--|--|--|--|
|Notation|`[1,3)`|`(0,2]`|`(0,3)`|`[1,2]`|
|Sequence length|`up - low`|`up - low`|`up - low - 1`|`up - low + 1`|

In terms of length, the first two forms are preferable, because the length is simply `up - low`, with no `+1` or `-1` correction.

Now consider only the first two forms. Suppose you want to represent the substring `hel`, which starts at the beginning of the string `hello`:

|Method|`[low, up)`|`(low, up]`|
|--|--|--|
|Notation|`[0,3)`|`(-1,2]`|
|Sequence length|`up - low`|`up - low`|

The second form has to start from `-1`, which is unnatural, so the first form `[low, up)` is the better choice.
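
The practical consequences of the half-open form can be checked directly. The sketch below uses the same string `hello`; the variable names `low`, `up`, and `k` are chosen only for illustration.

```python
s = "hello"

low, up = 1, 3
assert s[low:up] == "el"
assert len(s[low:up]) == up - low  # length needs no +1 or -1 correction

# Adjacent half-open ranges share a boundary without overlapping,
# so splitting at any k and rejoining restores the original.
for k in range(len(s) + 1):
    assert s[:k] + s[k:] == s

# An empty range is simply low == up.
assert s[2:2] == ""
```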

### Using 0-based indexing

> Just too beautiful to ignore.  
> -- Guido van Rossum

Two simple cases:

- The first `n` elements:
     - 0-based: `[0, n)`
     - 1-based: `[1, n+1)`

- The `n` elements from the `(i+1)`th to the `(i+n)`th:
     - 0-based: `[i, i+n)`
     - 1-based: `[i+1, i+n+1)`

The 1-based form needs an extra `+1` in both cases, which makes it less convenient.

For these two reasons, **Python** uses 0-based indexing.
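
The two cases above map directly onto slice expressions. In this sketch, `take` is a hypothetical helper introduced only for illustration; with 0-based half-open ranges, neither it nor the chunking loop needs a `+1` offset.

```python
s = "hello world"


def take(seq, i, n):
    """Hypothetical helper: n elements starting at 0-based position i."""
    return seq[i:i + n]


print(take(s, 0, 5))  # hello
print(take(s, 6, 5))  # world

# Consecutive blocks of n elements start at 0, n, 2n, ... with no offset.
n = 4
chunks = [s[k:k + n] for k in range(0, len(s), n)]
print(chunks)  # ['hell', 'o wo', 'rld']
```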