<a href="https://colab.research.google.com/github/da3344ma-s-jpg/Lectures/blob/main/lectures/05-strings.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Strings

Strings are of type `str` and are used to store text.

## Creation

In [None]:
'this is a string'

In [None]:
str('this is also a string')

In [None]:
"so is this" # here we used "" instead of ''

In [21]:
type(_) # get the type of the last output

str

In [None]:
a = '''we can also
create a string over
multiple lines''' # we assign the string to a variable `a`
a

Notice that each newline (`\n`) is saved in the string and is included when printing. Strings can also be joined (concatenated) with `+`.

In [None]:
print(a + ' and we can add more \nstuff')

Strings can also contain Unicode characters. Unicode is an international standard that covers letters, symbols, and emojis.

In [None]:
print('\N{ghost}', '\N{greek small letter gamma}') # unicode, see here: https://unicode.org/emoji/charts/full-emoji-list.html
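As a small sketch of how named Unicode escapes relate to code points, the same two characters can also be written with `\u` and `\U` escapes. The variable names below are only illustrative, and `unicodedata` is part of the standard library.

```python
import unicodedata

ghost = '\N{ghost}'                      # look up a character by its Unicode name
gamma = '\N{greek small letter gamma}'

# The same characters written with their code points instead of their names
ghost_by_code = '\U0001F47B'             # \U takes exactly 8 hex digits
gamma_by_code = '\u03b3'                 # \u takes exactly 4 hex digits

print(ghost == ghost_by_code, gamma == gamma_by_code)   # True True
print(unicodedata.name(gamma))           # GREEK SMALL LETTER GAMMA
print(hex(ord(gamma)), len(ghost))       # code point of gamma, length of the emoji string
```

Each of these strings has length 1, because Python counts Unicode characters and not bytes.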

## Common Operations, Indexing and Slicing

Let's illustrate a few operations we can perform on strings. The [Python documentation](https://docs.python.org/2/library/stdtypes.html#string-methods) gives a comprehensive list of the available string methods.

In [None]:
a = 'All work and no play makes Jack a dull boy'

In [None]:
len(a) # number of characters

In [None]:
a.replace('dull boy', 'dullboy') # a is not changed, but a new str is returned

In [None]:
a.split(' ') # create a list of words using space (' ') as a delimiter. More on lists later.
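A few more string methods that are often useful, shown as a short sketch. The string `text` is a made-up example and is deliberately not called `a`, so the cells around it are unaffected.

```python
text = '  All work and no play  '     # hypothetical example with extra whitespace

stripped = text.strip()               # remove leading and trailing whitespace
print(stripped.upper())               # ALL WORK AND NO PLAY
print(stripped.lower())               # all work and no play
print(stripped.find('work'))          # 4, the index where 'work' starts
print('play' in stripped)             # True, membership test
print(stripped.count('o'))            # 2, number of occurrences of 'o'

words = stripped.split(' ')           # list of words
print('-'.join(words))                # All-work-and-no-play, join is the inverse of split
```

Like `replace`, all of these return new values and leave the original string unchanged.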

With _indexing_ and _slicing_ we can for example extract parts of the string or reverse it. The format for slicing is `str[begin:end:step]` where `begin` is inclusive and `end` is exclusive, i.e. `[begin:end[`. By default the `step` is 1. More information [here.](https://www.digitalocean.com/community/tutorials/how-to-index-and-slice-strings-in-python-3)

In [None]:
a[0]   # first letter in string. Note that in Python, counting always starts at index zero (0)

In [None]:
a[-1]  # last letter in string. With a negative index we count backwards from the end

In [None]:
a[0:3] # index (letter) [0 to 3[.

In [None]:
a[1:3] # index number [1 to 3[.

In [None]:
a[::2] # skip every second letter

In [None]:
a[::-1] # reverse the whole string

In [None]:
a*4 # repeat n times
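To make the `str[begin:end:step]` rule easier to check by eye, here is a sketch on a short made-up string, `word`, where every index is easy to count.

```python
word = 'abcdefgh'        # hypothetical example string, indices 0 to 7

print(word[2:5])         # cde     indices 2, 3, 4 (end index 5 is excluded)
print(word[:3])          # abc     begin defaults to the start of the string
print(word[-3:])         # fgh     the last three characters
print(word[1::2])        # bdfh    every second character, starting at index 1
print(word[5:2:-1])      # fed     negative step: walk backwards from 5 down to 3
print(word[2:100])       # cdefgh  an end beyond the string is clipped, no error
```

Note that indexing a single position outside the string, such as `word[100]`, raises an `IndexError`, while slices are clipped silently.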

## Exercises

### Part 1

On a single line of code, slice `a` to extract the word "play" and then reverse it to form "yalp"

In [23]:
a = "play"
print(a[3::-1])

yalp
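One possible way to do this by slicing the full sentence directly, shown as a sketch. The sentence is stored under the hypothetical name `sentence` so that the variable `a` used in the other cells is not changed.

```python
sentence = 'All work and no play makes Jack a dull boy'

# "play" occupies indices 16 to 19, so slice it out and reverse the result
print(sentence[16:20][::-1])     # yalp

# The same in a single slice: start at 19 and step backwards down to 16
print(sentence[19:15:-1])        # yalp

# If you do not want to count by hand, find() returns the start index
start = sentence.find('play')
print(start)                     # 16
```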


### Part 2

Explain each line in the following code and make a new version with the same triangular layout but where the letters are printed to the _right_ of the smiley. Figure out what `\N` and `end` do.

In [25]:
i=1
substring = a[0:20]
for letter in substring:
    print(' ' * i + letter, end = ( len(substring) - i ) * '\N{grinning face}' + '\n')
    i += 1

 p😀😀😀
  l😀😀
   a😀
    y


Line-by-line explanation:

1. `i = 1`
   Defines the variable `i`, which controls how many spaces are printed before the letter.
2. `substring = a[0:20]`
   Creates a substring from the first 20 characters of `a`. Here `a` is `"play"`, which is shorter than 20 characters, so the substring is the whole string.
3. `for letter in substring:`
   Iterates over each letter of the substring.
4. `print(' ' * i + letter, end = ( len(substring) - i ) * '\N{grinning face}' + '\n')`
   * `' ' * i` → prints `i` spaces.
   * `+ letter` → prints the current letter.
   * The `end` argument replaces the default value (`"\n"`) with something custom:
   * `(len(substring) - i) * '\N{grinning face}'` → prints a series of 😀 faces, one fewer on each line. `\N{...}` is an escape sequence that inserts the Unicode character with the given name.
   * `+ '\n'` → ensures that the line ends with a line break.
5. `i += 1`
   Increments `i` to add more spaces on the next line.
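The effect of `end` and `\N` can also be seen in isolation. This is a minimal sketch; the names `smiley` and `line_end` are only illustrative.

```python
smiley = '\N{grinning face}'     # the character named GRINNING FACE, same as '\U0001F600'

print('a', end='')               # nothing is appended, so the next print continues the line
print('b', end=' | ')            # a custom separator instead of a newline
print('c')                       # default end='\n' finishes the line: ab | c

line_end = 2 * smiley + '\n'     # two smileys followed by a line break
print('x', end=line_end)         # prints x followed by two smileys, then a new line
```

Since `end` replaces the default `'\n'`, a custom value must include `'\n'` itself if the line should still end there.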


In [26]:
i = 1
substring = a[0:20]
for letter in substring:
    print((len(substring) - i)*'\N{grinning face}' + ' '*i + letter)
    i += 1

😀😀😀 p
😀😀  l
😀   a
    y


### Part 3

Create a _function_ that takes a string as _argument_ and returns the same string with every second character in UPPERCASE. For example, `hejsa` should become `HeJsA`. Equip your function with a descriptive [_docstring_](https://www.datacamp.com/community/tutorials/docstrings-python). Hint: the modulo operator, `%`, can be used to check for odd and even numbers.


In [27]:
def capitalize_every_second(input_string):
  """
  Capitalizes every second character of a string.

  Args:
    input_string: The string to process.

  Returns:
    The processed string with every second character capitalized.
  """
  result = ""
  for i, char in enumerate(input_string):
    if i % 2 == 0: # even index (1st, 3rd, 5th, ... character), as in the `HeJsA` example
      result += char.upper()
    else:
      result += char
  return result

# Example usage:
print(capitalize_every_second("hejsa"))
print(capitalize_every_second("hello world"))

HeJsA
HeLlO WoRlD
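A more compact sketch of the same idea, written to match the `hejsa` → `HeJsA` example from the exercise text. The function name `alternate_case` is hypothetical; it uses `str.join` with a generator expression instead of building the result with `+=`.

```python
def alternate_case(text):
    """Return `text` with the characters at even indices (0, 2, 4, ...) in uppercase."""
    return ''.join(
        char.upper() if index % 2 == 0 else char
        for index, char in enumerate(text)
    )

print(alternate_case('hejsa'))         # HeJsA
print(alternate_case('hello world'))   # HeLlO WoRlD
```

The space in `hello world` counts as a character too, which is why the pattern continues across it.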


## Conversion to and from Numbers
Strings can be converted to and from numbers. If we do not need any control over the formatting, we can use the `str()` function.

In [28]:
s1 = '2.21'     # a string
f = float(s1)   # str->float
s2 = str(f)     # float->str
print(s1, type(s1))
print(f, type(f))
print(s2, type(s2))

2.21 <class 'str'>
2.21 <class 'float'>
2.21 <class 'str'>


To convert to `int`, the string must represent an integer and thus cannot contain decimals.

In [29]:
for s in ['10', '5.5']: # list of strings
    if s.isdigit():
        i = int(s)
        print(i, type(i))
    else:
        print(f'cannot convert "{s}" to int')

10 <class 'int'>
cannot convert "5.5" to int
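Note that `isdigit()` only accepts strings made up entirely of digits, so it rejects inputs such as `'-3'` that `int()` can actually convert. A common alternative, sketched here with a hypothetical helper `to_int`, is to simply attempt the conversion and catch the `ValueError`.

```python
def to_int(text):
    """Return int(text), or None if text does not represent an integer."""
    try:
        return int(text)
    except ValueError:
        return None

for s in ['10', '-3', ' 42 ', '5.5', 'ten']:
    # compare the result of the attempt with what isdigit() reports
    print(repr(s), '->', to_int(s), '| isdigit:', s.isdigit())
```

Here `'-3'` and `' 42 '` are converted successfully although `isdigit()` returns `False` for both, while `'5.5'` and `'ten'` yield `None`.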


In the cell above we already used one of the two modern ways of combining text and numbers, in which the conversion to a string happens automatically.
This can be a little confusing because the recommended approach has changed over time. If you search online for solutions, you will come across both the old and the new ways of formatting.

It is worth noting that, since we are humans, much of what we interact with is text, that is, strings. This makes string formatting quite useful. Spending a bit more time on it will also pay off later when we talk about, e.g., how to format the labels in a plot.

### Old string replacement method

This is _not recommended_, but good to know for understanding legacy Python code.
In the old string replacement method, the general syntax is a string (shown in bold here) followed by `%` and a tuple `(...)` containing the values to be placed into the string, in the order in which they appear in the text.

**'This calculation returned %keyword1 as a result for %keyword2 iterations'**%(value_for_keyword1,value_for_keyword2)

"keyword1" and "keyword2" then need to indicate the format in which each value should be shown.
Some commonly used ones are:

* %i to place an integer
* %s to place a string
* %f to place a float,  %.3f  to place a float with 3 digits after the decimal point
* %g as a general-purpose choice that picks either fixed-point or exponential notation, depending on the size of the number

In [None]:
'the value %.2f is only the first of %i entries with 2 after comma digits'%(5.,10)
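To see what each of the format codes listed above does, here is a short sketch using made-up values.

```python
value = 1234.5678                  # hypothetical example number

print('%i' % 42)                   # 42
print('%s' % 'text')               # text
print('%f' % value)                # 1234.567800  (6 digits after the decimal point by default)
print('%.3f' % value)              # 1234.568     (rounded to 3 digits after the decimal point)
print('%g' % value)                # 1234.57      (6 significant digits, fixed-point notation)
print('%g' % 0.00001234)           # 1.234e-05    (very small numbers switch to exponential)

# Several values are passed as a tuple, in the order they appear in the string
print('%s has %i entries' % ('the list', 10))   # the list has 10 entries
```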

### New string replacement methods (recommended!)

With the new methods we do not need to specify the data type. Simply place curly braces `{}` in the string and use either `format` _or_ an [f-string](https://docs.python.org/3/tutorial/inputoutput.html):

~~~ python
'message: cannot convert "{}" to int'.format(s)

f'message: cannot convert "{s}" to int' # this is an f-string (recommended!)
~~~

In [30]:
'the value {} is only the first of {} entries'.format(5,10)

'the value 5 is only the first of 10 entries'
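The curly braces can also refer to the arguments by position or by name, which helps when a value is used more than once. A small sketch, with made-up values:

```python
template = 'the value {} is only the first of {} entries'
print(template.format(5, 10))                     # empty braces are filled in order

# Numbers inside the braces select an argument by position
print('{1} comes after {0}, and {0} comes first'.format('a', 'b'))
# b comes after a, and a comes first

# Names inside the braces select a keyword argument
print('{name} is {age} years old'.format(name='Jack', age=30))

# The f-string version reads the variables directly
name, age = 'Jack', 30
print(f'{name} is {age} years old')               # Jack is 30 years old
```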

Here too, the number of digits can be specified.<br>
Out of laziness, I will use the numpy package to get the value of pi.

In [31]:
import numpy as np   # a package for numerical calculations; it also provides pi
f'The number pi {np.pi:.3} is here limited to a total of 3 digits.'

'The number pi 3.14 is here limited to a total of 3 digits.'
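The format `:.3` counts significant digits in total, which is different from `:.3f`, which counts digits after the decimal point. The sketch below compares a few format specifications. It uses `math.pi` from the standard library instead of `np.pi` only to keep the example free of extra packages; the two values are the same.

```python
import math

pi = math.pi

print(f'{pi:.3}')         # 3.14       3 significant digits in total
print(f'{pi:.3f}')        # 3.142      3 digits after the decimal point
print(f'{pi:.3e}')        # 3.142e+00  exponential notation
print(f'{pi:8.2f}|')      # right-aligned in a field that is 8 characters wide
print(f'{1234.5678:.3}')  # 1.23e+03   3 significant digits forces exponential notation here
```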

The new formatting has the advantage that lists and other objects can also be formatted automatically.

In [32]:
mylist = [1, 2, 3]
f'I can print the list {mylist} with this method in one go.'

'I can print the list [1, 2, 3] with this method in one go.'

### Stepwise string creation

Finally, since strings can be combined with `+`, you can also build strings step by step:

In [33]:
my_string = f'π = {np.pi:.3f}' + ' and we used unicode to create the symbol'
my_string

'π = 3.142 and we used unicode to create the symbol'
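Building a string step by step is often done inside a loop. The following sketch shows two common patterns, with made-up content: appending with `+=`, and collecting the pieces in a list that is joined at the end.

```python
# Pattern 1: extend the string a little on every iteration
message = 'Squares:'
for n in range(1, 4):
    message += f' {n}^2={n**2}'
print(message)                      # Squares: 1^2=1 2^2=4 3^2=9

# Pattern 2: collect the pieces first, then join them with a separator
parts = [f'{n}^2={n**2}' for n in range(1, 4)]
joined = 'Squares: ' + ', '.join(parts)
print(joined)                       # Squares: 1^2=1, 2^2=4, 3^2=9
```

The second pattern makes it easy to put separators only between the pieces, without a leftover comma at the end.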

## Task
Adjust the following code to generate the following text using any of the methods above.

    The number pi 3.1 is here limited to a total of 2 digits.
    The number pi 3.14 is here limited to a total of 3 digits.
    The number pi 3.142 is here limited to a total of 4 digits.

In [34]:
print('The number pi xxx is here limited to a total of xxx digits.')
print('The number pi xxx is here limited to a total of xxx digits.')
print('The number pi xxx is here limited to a total of xxx digits.')

The number pi xxx is here limited to a total of xxx digits.
The number pi xxx is here limited to a total of xxx digits.
The number pi xxx is here limited to a total of xxx digits.


In [37]:
import numpy as np

print(f'The number pi {np.pi:.2} is here limited to a total of 2 digits.')
print(f'The number pi {np.pi:.3} is here limited to a total of 3 digits.')
print(f'The number pi {np.pi:.4} is here limited to a total of 4 digits.')

The number pi 3.1 is here limited to a total of 2 digits.
The number pi 3.14 is here limited to a total of 3 digits.
The number pi 3.142 is here limited to a total of 4 digits.
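The three lines differ only in the number of digits, so the same output can also be produced in a loop. This is a sketch of one possible variant: the precision is itself inserted with a nested pair of curly braces, and `math.pi` is used in place of `np.pi` to keep the example self-contained.

```python
import math

lines = []
for digits in (2, 3, 4):
    # {digits} inside the format specification sets the precision dynamically
    line = f'The number pi {math.pi:.{digits}} is here limited to a total of {digits} digits.'
    lines.append(line)
    print(line)
```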


In [39]:
np.pi
import scipy.constants as sc

In [40]:
sc.atomic_mass

1.66053906892e-27