### Strings
Strings are sequential collections of zero or more letters, numbers and other symbols. We call these letters, numbers and other symbols characters. Literal string values are differentiated from identifiers by using quotation marks (either single or double).

In [2]:
'Python'

'Python'

In [3]:
name = 'Python'

In [15]:
# Length of the string
len(name)

6

Since strings are sequences, all of the sequence operations described above work as you would expect. In addition, strings have a number of methods. For example,

In [16]:
# Change to upper case
name.upper()

'PYTHON'

In [17]:
name

'Python'

In [18]:
# Change to lower case
name.lower()

'python'
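As a quick illustration of the sequence operations mentioned above, here is a minimal sketch applied to the same `name` string. The variable names other than `name` are made up for this example.

```python
name = 'Python'

first = name[0]               # indexing: 'P'
last = name[-1]               # negative indexing counts from the end: 'n'
middle = name[1:4]            # slicing: 'yth'
greeting = 'I love ' + name   # concatenation: 'I love Python'
echo = name * 2               # repetition: 'PythonPython'
has_th = 'th' in name         # membership test: True

print(first, last, middle, greeting, echo, has_th)
```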

#### Split a string

In [19]:
name.split('t') # split at t

['Py', 'hon']

split is very useful for processing data. It takes a string and returns a list of strings, using the specified character as the division point. In the example, t is the division point. If no separator is specified, the split method splits on runs of whitespace characters such as tab, newline and space.

In [20]:
# For example
a = 'I love Python'
a.split()

['I', 'love', 'Python']
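A small sketch of how split might be used on a line of data, together with its counterpart join. The `record` and `messy` strings are hypothetical sample data, not part of the original notebook.

```python
record = 'alice,30,London'          # hypothetical comma-separated record

fields = record.split(',')          # split at every comma
first_only = record.split(',', 1)   # optional second argument limits the number of splits
rebuilt = '-'.join(fields)          # join is the inverse operation

messy = '  I   love\tPython\n'
words = messy.split()               # no argument: runs of whitespace are one separator

print(fields)       # ['alice', '30', 'London']
print(first_only)   # ['alice', '30,London']
print(rebuilt)      # alice-30-London
print(words)        # ['I', 'love', 'Python']
```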

### Other String operations

#### Right, Left and Center justification

In [21]:
name.center(10) # Returns the string center-justified in a field of width 10

'  Python  '

In [22]:
name.rjust(10) # Returns the string right-justified in a field of width 10

'    Python'

In [23]:
name.ljust(10) # Returns the string left-justified in a field of width 10

'Python    '
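The justification methods also accept an optional fill character, which makes them handy for lining up simple text tables. A minimal sketch, where `rows` is made-up sample data:

```python
name = 'Python'
starred = name.center(10, '*')   # pad with '*' instead of spaces

rows = [('apple', 3), ('kiwi', 12)]   # hypothetical (item, count) pairs
lines = [item.ljust(8, '.') + str(count).rjust(4) for item, count in rows]

print(starred)   # **Python**
for line in lines:
    print(line)
# apple...   3
# kiwi....  12
```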

#### lower case

In [24]:
name_upper = 'PYTHON'

In [25]:
name_upper.lower()

'python'

#### Find the first occurrence of an item

In [26]:
name.find('p')

-1

In [27]:
name.find('P')

0

The find method is case-sensitive, which is why it returns -1 for lowercase 'p' and the correct index 0 for 'P'.

#### We can also search for other strings within our string

In [28]:
substring = 'th'

In [29]:
name.find(substring)

2

The sequence 'th' starts at the index position 2.

#### When the item is not found, find returns -1

In [30]:
name.find('d')

-1
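Because find signals "not found" with -1 and a match at the start with 0, using its result directly as a condition can be misleading. The sketch below contrasts find with the `in` operator and with the related index method, which raises an exception instead of returning -1.

```python
name = 'Python'

position = name.find('d')    # -1: not found
contains = 'd' in name       # False: a plain membership test is often clearer

# find('P') returns 0, which is falsy, so compare against -1 explicitly
found_p = name.find('P') != -1

# index() behaves like find() but raises ValueError when nothing is found
try:
    name.index('d')
except ValueError:
    index_failed = True
else:
    index_failed = False

print(position, contains, found_p, index_failed)   # -1 False True True
```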

## Interesting!!

A major difference between lists and strings is that lists can be modified while strings cannot. This is referred to as mutability. Lists are mutable; strings are immutable. For example, we can change an item in a list by using indexing and assignment. With a string that change is not allowed.

In [31]:
name[3] = 'r'

TypeError: 'str' object does not support item assignment
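Since a string cannot be changed in place, the usual approach is to build a new string. Two possible ways are sketched below; the names `letters`, `from_list` and `from_slices` are only illustrative.

```python
name = 'Python'

# Option 1: convert to a (mutable) list, modify it, and join it back
letters = list(name)
letters[3] = 'r'
from_list = ''.join(letters)

# Option 2: assemble a new string from slices of the old one
from_slices = name[:3] + 'r' + name[4:]

print(from_list)     # Pytron
print(from_slices)   # Pytron
print(name)          # Python (the original string is unchanged)
```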

### String Comparisons
Comparing strings is one of the most important string operations to learn, and something we will do often.

In general, there are two kinds of comparison in Python: using "is" and using "=="

In [32]:
a='this is a very long string'
b='this is a very long string'

In [33]:
# The first operation using 'is' may or may not result in True based on where these items, 
# i.e., strings, are stored in memory.
a is b

False

In [34]:
# For equality check, we should actually be using "=="
a == b

True

### Use of id()
The id() function returns identity (unique integer) of an object.

In [35]:
# Checking, id() shows that they are two different objects.
id(a)

1877868068576

In [36]:
id(b)

1877868068656

We can see that the two objects are different even though they contain the same data.

Now let's compare two integers with <b>"is"</b>

In [37]:
'3' == '3'

True

In [46]:
a = 3
b = 3

In [47]:
a is b

True

In [48]:
a == b

True

Now that you have seen this, let's try larger numbers and check for equality again

In [41]:
a = 1211231232
b = 1211231232

In [42]:
a is b

False

In [43]:
a == b

True

### Interesting!!
Now that's strange!! Though a and b hold the same number, the "is" check resulted in False. Why??

### Caching of small integers

Small values like 3 are "cached" in CPython: because they are used so often, the interpreter creates them once and reuses the same object, which saves memory. Large values like 1211231232 are not cached. This is purely an implementation detail of CPython, so we should not depend on such behavior!

Hence, when a and b hold large numbers, comparing them with "is" results in False (because they are stored in different locations), while "==" behaves the way we expect. It is therefore best to use "==" for value comparisons.
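A minimal sketch of the safe pattern: the two integers are built at run time with int() so that each call produces its own result, and only the "==" result is treated as meaningful. What "is" prints depends on the interpreter and should not be relied upon.

```python
a = int('1211231232')
b = int('1211231232')

same_value = (a == b)    # compares values: True on every Python implementation
same_object = (a is b)   # compares identities: an implementation detail

print(same_value)    # True
print(same_object)   # do not depend on this result
```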

#### Let's check how caching works using id()

In [43]:
a = 3
b = 3
a is b

True

In [44]:
id(a)

1934762736

In [45]:
id(b)

1934762736

#### Turns out the integer 3 has been cached and its memory location is 1934762736

In [46]:
# We'll try it with bigger numbers now
a = 12311412415
b = 12311412415

In [47]:
a is b

False

In [48]:
id(a)

3015504058448

In [49]:
id(b)

3015504058672

#### So, as expected, the big numbers haven't been cached and are stored at different locations

### Other examples:

#### Comparing String Objects

In [61]:
k1 = 'bob'
k2 = 'bob'

In [62]:
k1 is k2

True

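Unlike the longer string compared earlier, both names here refer to the same object. CPython may reuse a single object for equal string literals (short, identifier-like literals such as 'bob' are typically shared). This is safe because strings are immutable in Python: the shared string can never be modified. Like integer caching, it is an implementation detail, so use "==" to compare strings.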
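If sharing equal strings is really wanted, the standard library offers sys.intern(), which returns one canonical object per distinct string value. The sketch below builds two equal strings at run time and then interns them; the variable names are illustrative only.

```python
import sys

parts = ['this is a very', 'long string']
a = ' '.join(parts)   # built at run time
b = ' '.join(parts)   # equal value, built separately

equal = (a == b)                              # True: same characters
shared = sys.intern(a) is sys.intern(b)       # True: interning yields one shared object

print(equal, shared)   # True True
```

Even so, "==" remains the right tool for comparing string values; interning is mainly a memory and speed optimization.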

#### Comparing Datetime objects

In [50]:
import datetime
datetime.date.today() == datetime.date.today()

True

In [51]:
datetime.date.today() is datetime.date.today()

False

That resulted in False because the two calls return two different date objects, even though their values are equal.

Let's check how caching works in lists with small integer elements.

In [52]:
A = [1, 2, 3, 4]
B = A[0:2]

In [53]:
id(A) == id(B)

False

In [54]:
id(A[0]) == id(B[0])

True

CPython caches small integers, so any value from -5 to 256 will have the same id every time we check it.

When we execute the statement B = A[0:2], it ends up essentially doing this, as part of it: B[0] = A[0] i.e. assigning the 0th index in A to 0th index of B.
    
So the object (the integer 1) in A[0] is the same object which is in B[0].

CPython caches small integers. So we've got A[0] == 1 == B[0], and id(1) == id(1).
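The sharing of elements between a list and its slice does not depend on integer caching: a slice is a new list that holds references to the same element objects. A small sketch using mutable elements (nested lists, chosen only for illustration) makes this visible:

```python
A = [[1], [2], [3]]
B = A[0:2]

different_lists = B is not A    # True: the slice is a new list object
shared_element = B[0] is A[0]   # True: both lists refer to the same inner list

B[0].append(99)                 # modify the shared inner list through B

print(different_lists, shared_element)   # True True
print(A[0])                              # [1, 99] (the change is visible through A too)
```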

### General Understanding

if x is y then x==y is also True

It is very important to know that this is not the same as

if x==y then x is y

Which means we should use "==" when comparing values and "is" when comparing identities. (Also, from the perspective of the English language, "equals" is different from "is".)

### Summary of this:
is : used for identity testing (the very same object)

== : used for equality testing (equal values)
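A short sketch of both operators side by side. It also shows the most common legitimate use of "is", the check against None, and one well-known corner case (NaN) where an object is identical to itself yet does not compare equal. The variable names are made up for this example.

```python
x = [1, 2, 3]
y = [1, 2, 3]
z = x

equal_xy = (x == y)       # True: same contents
identical_xy = (x is y)   # False: two separate list objects
identical_xz = (x is z)   # True: z is just another name for x

result = None
is_missing = (result is None)   # idiomatic identity test for None

nan = float('nan')
nan_identical = (nan is nan)    # True: it is the same object
nan_equal = (nan == nan)        # False: NaN never compares equal, even to itself

print(equal_xy, identical_xy, identical_xz, is_missing, nan_identical, nan_equal)
```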

### Few things to know about Memory allocation and CPython 
CPython is the reference implementation of the Python programming language. Written in C, CPython is the default and most widely used implementation of the language. CPython is an interpreter.

CPython allocates objects from a heap that gets scrambled up as objects are malloc'd and free'd. As a result, different elements of one list may be stored at memory locations that are not close to each other and do not follow any incremental pattern.

Python numbers are not simple pieces of data. They are objects: in Python 3, int is a single arbitrary-precision type that grows as the value gets larger (Python 2 started with a C long and auto-promoted to a BigNumber-style long when the value got too large).

Just because we store (say) a 32-bit int in a data structure in a scripting language doesn't mean we'll end up using 32 bits of memory. There's ALWAYS metadata attached to ANY data we store, such as type, size, length, et al.
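One way to see this overhead is sys.getsizeof(), which reports the size of an object in bytes. This is a sketch that assumes CPython; the exact numbers vary by version and platform, so none are shown here.

```python
import sys

small = 3
large = 2 ** 100   # far too big for a 32-bit or 64-bit machine integer

size_small = sys.getsizeof(small)   # includes the object header, not just the value
size_large = sys.getsizeof(large)   # larger values need more storage

print(size_small, size_large)
```

Even the small integer typically takes noticeably more than the 4 bytes a raw 32-bit value would need.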

Knowing Python and knowing a particular implementation (e.g. CPython) are two entirely different things. Even knowing CPython inside out will not tell the whole story, as CPython calls upon memory managers that are not part of CPython but of the respective operating system.

### So what does id() return??

It is "an integer (or long integer) which is guaranteed to be unique and constant for this object during its lifetime." (Python Standard Library - Built-in Functions) A unique number. Nothing more, and nothing less. Think of it as a social-security number or employee id number for Python objects.

### Is it the same as a memory address in C?

Conceptually, yes, in that they are both guaranteed to be unique in their universe during their lifetime. And in one particular implementation of Python (CPython), it actually is the memory address of the corresponding C object.
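The relationship between id() and "is" can be sketched as follows: for objects that are alive at the same time, comparing ids gives the same answer as the "is" operator. The lists here are arbitrary sample objects.

```python
a = [1, 2, 3]
b = a         # a second name for the same object
c = list(a)   # a separate copy with equal contents

same = (id(a) == id(b))      # True: one object, one id
differ = (id(a) != id(c))    # True: two live objects never share an id
consistent = ((a is b) == (id(a) == id(b))) and ((a is c) == (id(a) == id(c)))

print(same, differ, consistent)   # True True True
```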

### Check the Implementation of Python we are using

In [82]:
import platform    
platform.python_implementation()

'CPython'

## Replace in String

#### replace()
The method replace() returns a copy of the string in which the occurrences of old have been replaced with new, optionally restricting the number of replacements to max.

In [55]:
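# Use a name other than str so the built-in str type is not shadowed
text = "this is probably the most fun thing I have ever did. It literally is!!"
print(text.replace("is", "was"))     # replace every occurrence
print(text.replace("is", "was", 1))  # replace only the first occurrence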

thwas was probably the most fun thing I have ever did. It literally was!!
thwas is probably the most fun thing I have ever did. It literally is!!
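Note that replace() works on raw substrings, which is why "this" turned into "thwas" in the output above. If only the whole word "is" should change, one possible approach (not part of the original example) is a regular expression with word boundaries:

```python
import re

text = "this is probably the most fun thing I have ever did. It literally is!!"

# \b matches a word boundary, so the "is" inside "this" is left alone
whole_words = re.sub(r'\bis\b', 'was', text)

print(whole_words)
# this was probably the most fun thing I have ever did. It literally was!!
print(text)   # the original string is unchanged; both methods return new strings
```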


## Miscellaneous Operations

#### istitle()
The method istitle() checks whether the string is title-cased: uppercase characters may only follow uncased characters (such as spaces or punctuation), and lowercase characters may only follow cased ones.

In [4]:
# Use a name other than str so the built-in str type is not shadowed
text = "This Is A String Example...Wow!!!"
print(text.istitle())
text = "This is a string example....wow!!!"
print(text.istitle())

True
False
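The companion method title() converts a string to title case, so its result always passes the istitle() check. A brief sketch, reusing the second example string; note that title() treats an apostrophe as a word boundary, which can give surprising results.

```python
text = "This is a string example....wow!!!"

titled = text.title()
print(titled)             # This Is A String Example....Wow!!!
print(titled.istitle())   # True

# Apostrophes count as word boundaries for title()
contraction = "they're here".title()
print(contraction)        # They'Re Here
```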
