## **D3TOP - Tópicos em Ciência de Dados (IFSP Campinas)**
**Prof. Dr. Samuel Martins (@iamsamucoding @samucoding @xavecoding)** <br/>
xavecoding: https://youtube.com/c/xavecoding <br/><br/>

<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.

<hr/>

# Strings

### Strings are also sequences

In [3]:
# strings can be delimited by single or double quotes; both are equivalent
name_1 = 'Luke Skywalker'
name_2 = "Darth Vader"

print(name_1)
print(name_2)

Luke Skywalker
Darth Vader


In [4]:
type(name_1)

str

In [5]:
# return the first character
name_1[0]

'L'

In [6]:
# return the 7th character
name_1[6]

'k'

In [7]:
# return the penultimate character
name_1[-2]

'e'

In [9]:
# return the first four characters
name_1[:4]

'Luke'

In [10]:
# string length
len(name_1)

14
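
Since strings are sequences, the other sequence operations work on them as well. The sketch below reuses the value of `name_1` and shows slicing with a step, a membership test, and iteration. The names `last_name`, `reversed_name`, `has_sky`, and `vowel_count` are illustrative only.

```python
name_1 = 'Luke Skywalker'

# slice from index 5 to the end
last_name = name_1[5:]

# a negative step walks the string backwards
reversed_name = name_1[::-1]

# membership test with `in`
has_sky = 'Sky' in name_1

# iterate character by character and count the vowels
vowel_count = 0
for char in name_1:
    if char.lower() in 'aeiou':
        vowel_count += 1

print(last_name)
print(reversed_name)
print(has_sky)
print(vowel_count)
```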

#### Strings are immutable

In [11]:
name = 'Luke skywalker'

In [12]:
name[5]

's'

In [13]:
name[5] = 'X'

TypeError: 'str' object does not support item assignment

To replace a character at a specific position, we need a "workaround", shown further below once `join` has been introduced.
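
In the meantime, here is a minimal sketch of one possible approach that needs only slicing and concatenation: build a new string from the part before the index, the new character, and the part after the index. The name `new_name` is illustrative.

```python
name = 'Luke skywalker'

# everything before index 5 + the new character + everything after index 5
new_name = name[:5] + 'S' + name[6:]

print(name)      # the original string is unchanged
print(new_name)
```

No character is modified in place here. A brand-new string is created, which is consistent with strings being immutable.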

### String Methods

`.upper()` returns a copy of the string with all letters converted to uppercase.

In [14]:
name.upper()

'LUKE SKYWALKER'

In [15]:
# original string is not changed
name

'Luke skywalker'

In [16]:
name_upper = name.upper()
print(name)
print(name_upper)

Luke skywalker
LUKE SKYWALKER


In [17]:
'Luke skywalker'.upper()

'LUKE SKYWALKER'

`.lower()` returns a copy of the string with all letters converted to lowercase.

In [18]:
name_lower = name.lower()
print(name)
print(name_lower)

Luke skywalker
luke skywalker


In [20]:
name_upper_to_lower = name_upper.lower()
print(name_upper)
print(name_upper_to_lower)

LUKE SKYWALKER
luke skywalker


`str.startswith(substring)` checks if the string `str` starts with the substring `substring`.

In [21]:
print(name)
print(name_upper)
print(name_lower)

Luke skywalker
LUKE SKYWALKER
luke skywalker


In [22]:
name.startswith('Luke')

True

In [23]:
name_upper.startswith('Luke')

False

In [24]:
name_upper.startswith('LUKE')

True

`str.endswith(substring)` checks whether the string `str` ends with the substring `substring`.

In [25]:
print(name)
print(name_upper)
print(name_lower)

Luke skywalker
LUKE SKYWALKER
luke skywalker


In [26]:
name.endswith('r')

True

In [27]:
name.endswith('walker')

True

In [28]:
name.endswith('WALKER')

False

In [29]:
img_filename = 'image.png'

# check if the file extension is '.png'
if img_filename.endswith('.png'):
    print("It's PNG")
else:
    print("NO PNG")

It's PNG


In [30]:
img_filename = 'image.PNG'

# check if the file extension is '.png'
if img_filename.endswith('.png'):
    print("It's PNG")
else:
    print("NO PNG")

NO PNG


In [31]:
img_filename = 'image.PNG'

# check if the file extension is '.png'
# if img_filename.endswith('.png') or img_filename.endswith('.PNG'):
if img_filename.lower().endswith('.png'):
    print("It's PNG")
else:
    print("NO PNG")

It's PNG


In [32]:
img_filename = 'image.png'

# check if the file extension is '.png'
# if img_filename.endswith('.png') or img_filename.endswith('.PNG'):
if img_filename.lower().endswith('.png'):
    print("It's PNG")
else:
    print("NO PNG")

It's PNG
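
`endswith` (like `startswith`) also accepts a tuple of candidates and returns `True` if any of them matches. The sketch below combines this with `.lower()` to test several extensions at once. The helper `has_image_extension` and its extension list are hypothetical and serve only as an example.

```python
def has_image_extension(filename):
    # hypothetical helper; the extension list is only an example
    image_extensions = ('.png', '.jpg', '.jpeg')
    # str.endswith accepts a tuple and returns True if any suffix matches
    return filename.lower().endswith(image_extensions)


print(has_image_extension('image.PNG'))
print(has_image_extension('photo.jpeg'))
print(has_image_extension('notes.txt'))
```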


`string.split(substring)` splits the string at each occurrence of the separator `substring` and returns a list with the resulting parts.

In [49]:
directory = '~/documents/files'

subdirs = directory.split('/')
subdirs

['~', 'documents', 'files']

In [40]:
img_filename = 'image.PNG'
img_filename_without_ext = img_filename.lower().split('.png')[0]

print(img_filename)
print(img_filename_without_ext)

image.PNG
image


In [42]:
# if the separator doesn't occur, .split returns a list with a single element: the string itself
directory.split('ABC')

['~/documents/files']
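
A few variations of `split` are often useful. The sketch below shows `split()` with no argument, the optional `maxsplit` argument, and `rsplit`. The sample sentence and the variable names are illustrative.

```python
sentence = '  May the   Force be with you  '

# with no argument, split() separates on runs of whitespace
# and ignores leading/trailing whitespace
words = sentence.split()

# maxsplit limits the number of splits (counted from the left)
first_split = '~/documents/files'.split('/', 1)

# rsplit counts the splits from the right
last_split = '~/documents/files'.rsplit('/', 1)

print(words)
print(first_split)
print(last_split)
```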

`separator.join(list_of_strings)` concatenates the strings of a list, inserting `separator` between consecutive elements.

In [53]:
print(subdirs)

full_directory = '/'.join(subdirs)

print(full_directory)

['~', 'documents', 'files']
~/documents/files
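
Note that `join` only accepts strings. If the list holds other types, such as integers, each element has to be converted first. A minimal sketch, with illustrative data:

```python
years = [1972, 1977, 1960]

# ', '.join(years) would raise a TypeError because the items are ints,
# so each item is converted with str() first
years_as_text = ', '.join(str(year) for year in years)

print(years_as_text)
```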


In [54]:
# "workaround" to change the character at a given index of a string
name = 'Luke skywalker'
print(name)

Luke skywalker


In [55]:
name[5] = 'S'

TypeError: 'str' object does not support item assignment

In [56]:
name_as_list = list(name)
name_as_list

['L', 'u', 'k', 'e', ' ', 's', 'k', 'y', 'w', 'a', 'l', 'k', 'e', 'r']

In [58]:
name_as_list[5] = 'S'
name_as_list

['L', 'u', 'k', 'e', ' ', 'S', 'k', 'y', 'w', 'a', 'l', 'k', 'e', 'r']

In [59]:
''.join(name_as_list)

'Luke Skywalker'

`replace` returns a copy of the string in which one substring is replaced with another.

In [60]:
txt = "I like bananas"
print(txt)

I like bananas


In [61]:
new_txt = txt.replace('bananas', 'apples')

print(txt)
print(new_txt)

I like bananas
I like apples
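
By default, `replace` substitutes every occurrence of the substring. An optional third argument limits how many occurrences are replaced. The sketch below uses an illustrative sentence to show both behaviors.

```python
txt = 'I like bananas and bananas like me'

# every occurrence is replaced by default
all_replaced = txt.replace('bananas', 'apples')

# the optional third argument limits the number of replacements
first_replaced = txt.replace('bananas', 'apples', 1)

# if the substring is not found, the result equals the original string
unchanged = txt.replace('pears', 'apples')

print(all_replaced)
print(first_replaced)
print(unchanged)
```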


#### Concatenating strings

In [62]:
print(directory)
print(img_filename)

~/documents/files
image.PNG


In [63]:
full_path = directory + '/' + img_filename

print(full_path)

~/documents/files/image.PNG


<br/>

Python provides a module to _handle OS pathnames_: `os.path`

In [64]:
import os

In [65]:
print(directory)
print(img_filename)

~/documents/files
image.PNG


In [66]:
os.path.join(directory, img_filename)

'~/documents/files/image.PNG'
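
`os.path` can also take a path apart. The sketch below uses `os.path.dirname`, `os.path.basename`, and `os.path.splitext` on the same example path. The variable names are illustrative, and the results in the comments assume POSIX-style paths.

```python
import os

full_path = '~/documents/files/image.PNG'

# split the path into the directory part and the final component
parent_dir = os.path.dirname(full_path)     # '~/documents/files'
filename = os.path.basename(full_path)      # 'image.PNG'

# split the filename into (root, extension); the case is preserved
root, ext = os.path.splitext(filename)      # ('image', '.PNG')

print(parent_dir)
print(filename)
print(root, ext)
```

Compared with `img_filename.lower().split('.png')[0]`, `os.path.splitext` does not change the case of the filename and does not depend on a specific extension.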

<br/>

There are other useful functions in the `os` and `os.path` modules:

In [67]:
# list the entries (files and directories) in the current directory ('.')
os.listdir('.')

['basic_of_strings_base_code.ipynb',
 'basic_of_regular_expressions.ipynb',
 'basic_of_strings.ipynb',
 'basic_of_files_base_code.ipynb',
 '.ipynb_checkpoints',
 'basic_of_regular_expressions_base_code.ipynb',
 'demos',
 'basic_of_files.ipynb']

In [68]:
# check if the file "basic_of_strings.ipynb" exists
os.path.exists('basic_of_strings.ipynb')

True

In [69]:
# check if the file "samuka.nice" exists
os.path.exists('samuka.nice')

False
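
`os.path.exists` returns `True` for both files and directories. To tell them apart, `os.path.isfile` and `os.path.isdir` can be used. The sketch below runs inside a temporary directory created with the standard `tempfile` module, so it does not depend on the files of this notebook. The filename `example.txt` is hypothetical.

```python
import os
import tempfile

# work inside a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, 'example.txt')  # hypothetical filename

    # create an empty file
    with open(file_path, 'w') as f:
        pass

    file_exists = os.path.exists(file_path)   # True
    is_file = os.path.isfile(file_path)       # True
    dir_is_file = os.path.isfile(tmp_dir)     # False: it is a directory
    dir_is_dir = os.path.isdir(tmp_dir)       # True
    entries = os.listdir(tmp_dir)             # ['example.txt']

# the directory and its contents are removed when the block ends
exists_after = os.path.exists(file_path)      # False

print(file_exists, is_file, dir_is_file, dir_is_dir)
print(entries)
print(exists_after)
```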

## Formatted String Literals (f-strings)

Python 3.6 introduced <strong>f-strings</strong>, which offer several advantages over the older `.format()` string method. <br/>
One of them is that variables in scope can be referenced directly inside the string, instead of being passed as arguments to `.format()`.

In [70]:
name = 'Luke skywalker'

# Using the old .format() method:
print('My favorite Jedi is {var}.'.format(var=name))
print('My favorite Jedi is {0}.'.format(name))

My favorite Jedi is Luke skywalker.
My favorite Jedi is Luke skywalker.


In [71]:
# Using f-strings:
print(f'My favorite Jedi is {name}.')

My favorite Jedi is Luke skywalker.
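
Replacement fields are not limited to plain variable names: any expression can go between the braces. The sketch below shows a method call, a function call, arithmetic, and a format specifier. The variable `age` and the result names are illustrative.

```python
name = 'Luke skywalker'
age = 16

# method calls and function calls
shout = f'{name.upper()} has {len(name)} characters.'

# arithmetic
next_year = f'Next year he will be {age + 1}.'

# a format specifier after the colon, here two decimal places
ratio = f'{2 / 3:.2f}'

print(shout)
print(next_year)
print(ratio)
```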


Append `!r` to the expression to get its <strong>string representation</strong> (the result of `repr()`):

In [72]:
print(f'My favorite Jedi is {name!r}.')

My favorite Jedi is 'Luke skywalker'.
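
The `!r` conversion is handy when debugging, because it makes whitespace and escape characters visible. A minimal sketch, using an illustrative value with leading spaces and a trailing newline:

```python
value = '  42\n'

plain = f'[{value}]'       # uses str(value)
quoted = f'[{value!r}]'    # uses repr(value)

print(plain)
print(quoted)

# the conversion gives the same text as calling repr() directly
same_as_repr = f'{value!r}' == repr(value)
print(same_as_repr)
```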


Make sure the **quotation marks** used inside the replacement fields do not conflict with the ones delimiting the outer string. Before Python 3.12, reusing the same quote character inside a replacement field is a syntax error.

In [74]:
person = {
    'name': 'Luke skywalker',
    'age': 16
}

In [75]:
person

{'name': 'Luke skywalker', 'age': 16}

In [76]:
print(f'Name: {person['name']}')

SyntaxError: f-string: unmatched '[' (<ipython-input-76-b56171ea29fd>, line 1)

In [77]:
print(f"Name: {person['name']}")

Name: Luke skywalker


In [78]:
print(f'Name: {person["name"]}')

Name: Luke skywalker


### Minimum Widths, Alignment and Padding
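After the colon in a replacement field, a format specification can set a minimum field width, the alignment, and even the padding character. Nested curly braces, as in `{title:{15}}`, let the specification itself be computed from an expression; with a constant width, `{title:15}` is equivalent.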

In [88]:
movies = [('Title', 'Genre', 'Year'), ('The Godfather', 'Drama', 1972), ('Star Wars', 'Fantasy', 1977), ('Psycho', 'Terror', 1960)]

for mov in movies:
    print(f'{mov[0]:{15}} {mov[1]:{10}} {mov[2]:{8}}')

Title           Genre      Year    
The Godfather   Drama          1972
Star Wars       Fantasy        1977
Psycho          Terror         1960


In [89]:
movies = [('Title', 'Genre', 'Year'), ('The Godfather', 'Drama', 1972), ('Star Wars', 'Fantasy', 1977), ('Psycho', 'Terror', 1960)]

for title, genre, year in movies:
    print(f'{title:{15}} {genre:{10}} {year:{8}}')

Title           Genre      Year    
The Godfather   Drama          1972
Star Wars       Fantasy        1977
Psycho          Terror         1960


In [95]:
title, genre, year = movies[0]

print(f'{title:{15}} {genre:{10}} {year:{8}}')
print(f'{"-" * 15} {"-" * 10} {"-" * 8}')

for title, genre, year in movies[1:]:
    print(f'{title:{15}} {genre:{10}} {year:{8}}')

Title           Genre      Year    
--------------- ---------- --------
The Godfather   Drama          1972
Star Wars       Fantasy        1977
Psycho          Terror         1960
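
The tables above only set a minimum width. In that case strings are left-aligned and numbers are right-aligned by default, which is why the `Year` header and the year values do not line up. The sketch below shows how alignment and padding can be set explicitly in the format specification. The variable names are illustrative.

```python
title, year = 'Psycho', 1960

left = f'{title:<10}|'        # left-aligned (default for strings)
right = f'{title:>10}|'       # right-aligned (default for numbers)
centered = f'{title:^10}|'    # centered
padded = f'{title:*^10}|'     # centered, padded with '*'
zero_padded = f'{year:08}'    # numbers can be zero-padded

# the width can also come from a variable via nested braces
width = 12
dynamic = f'{title:>{width}}|'

for line in (left, right, centered, padded, zero_padded, dynamic):
    print(line)
```

Using `{year:<8}` in the table rows, for example, would left-align the years under the `Year` header.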
