<header id="nbis-notebooks">
<a href="https://nbis.se"><img src="https://nbisweden.github.io/PythonCourse/img/nbis.png" alt="NBIS" /></a>
<h1>Introduction to Python<small>HT17</small></h1>
</header>

Jupyter Notebooks save the commands demonstrated in class together with their output, so you can conveniently re-run the session.

You can also open a terminal and start the `python` interpreter by issuing the command:

```
$ python
Python 3.5.0 (default, Sep 25 2015, 16:02:14)
...
>>>
```

----

## Formatting strings



In [None]:
valueA = 3
valueB = 'pigs'
valueC = 'Story'

In [None]:
"The '{} little {}' book is a nice {}".format(valueA,valueB,valueC)

In [None]:
"The '{0} little {1}' book is a nice {2}".format(valueA,valueB,valueC)

In [None]:
"The '{0} little {0}' book is a nice {0}".format(valueA,valueB,valueC)

In [None]:
# Raises IndexError: format() received only three arguments (indices 0 to 2)
"The '{32} little {33}' book is a nice {34}".format(valueA,valueB,valueC)

In [None]:
"The '{num} little {name}' book is a nice {thing}".format(num=valueA,name=valueB,thing=valueC)

In [None]:
"The '{0} little {1}' book is a nice {2}".format(num=valueA,name=valueB,thing=valueC)

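Numbered fields such as `{0}` are filled from positional arguments, and named fields such as `{num}` from keyword arguments, so the cell above raises an `IndexError`. In a call to `format`, positional arguments come first, followed by keyword arguments.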
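As a minimal sketch of the rule above, the following mixes positional and keyword arguments in one call. The names `value_a`, `value_b`, `value_c`, `mixed` and `error_raised` are chosen for this illustration only.

```python
# Illustration values, mirroring valueA, valueB and valueC above
value_a = 3
value_b = 'pigs'
value_c = 'Story'

# Numbered fields take positional arguments, named fields take keyword arguments
mixed = "The '{0} little {1}' book is a nice {thing}".format(value_a, value_b, thing=value_c)
print(mixed)

# A numbered field without a matching positional argument raises IndexError
error_raised = False
try:
    "The '{0} little {1}'".format(num=value_a, name=value_b)
except IndexError as error:
    error_raised = True
    print('IndexError:', error)
```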

---

In [None]:
"The '{0} little {1}' book is a nice {2}".format(valueA,valueB,valueC)

Field width

In [None]:
"The '{0:10} little {1:20}' book is a nice {2:30}".format(valueA,valueB,valueC)

Alignment

In [None]:
"The '{0:<10} little {1:>20}' book is a nice {2:^30}".format(valueA,valueB,valueC)

Filling

In [None]:
"The '{0:å<10} little {1:Ä>20}' book is a nice {2:Ö^30}".format(valueA,valueB,valueC)

In [None]:
"|{0:-^20}|{1:-^20}|{2:-^20}|".format('Lines','Words','Characters')

In [None]:
"|{0:-^20}|{1:-^20}|{2:-^20}|".format(' Lines ',' Words ',' Characters ')
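Width, alignment and fill can be combined to print a small table. The sketch below assumes some made-up file statistics in `rows`; the column widths are arbitrary.

```python
# Hypothetical data: (file name, number of lines, number of words)
rows = [('notes.txt', 12, 340), ('draft.txt', 7, 95)]

# Header: centered titles, padded with '-'
header = '|{0:-^12}|{1:-^9}|{2:-^9}|'.format(' File ', ' Lines ', ' Words ')
lines = [header]

# Body: names left-aligned, numbers right-aligned, same widths as the header
for name, n_lines, n_words in rows:
    lines.append('|{0:<12}|{1:>9}|{2:>9}|'.format(name, n_lines, n_words))

table = '\n'.join(lines)
print(table)
```

Because every field has a fixed width, all rows come out the same length and the columns line up.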

Conversion

In [None]:
"int: {0:d};  hex: {0:x};  oct: {0:o};  bin: {0:b}".format(42)

In [None]:
"int: {0:d};  hex: {0:#x};  oct: {0:#o};  bin: {0:#b}".format(42)

Rounding

In [None]:
points = 19
total = 22
'Score: {:.2}'.format( points / total )

In [None]:
'Score: {:.2%}'.format( points / total )
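A precision without a type letter, as in `{:.2}`, counts significant digits for a float, whereas `{:.2f}` counts digits after the decimal point. The two happen to agree for the score above, so this sketch adds a smaller made-up value, `small`, where they differ.

```python
points = 19
total = 22
ratio = points / total                    # 0.8636...

significant = '{:.2}'.format(ratio)       # 2 significant digits
fixed = '{:.2f}'.format(ratio)            # 2 digits after the decimal point
percent = '{:.1%}'.format(ratio)          # times 100, 1 decimal, with a % sign
print(significant, fixed, percent)

small = 0.0123456                         # hypothetical value for comparison
small_significant = '{:.2}'.format(small)
small_fixed = '{:.2f}'.format(small)
print(small_significant, small_fixed)
```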

---

In [None]:
import datetime
d = datetime.datetime(2010, 7, 4, 12, 15, 58)
'{:%Y-%m-%d %H:%M:%S}'.format(d)

In [None]:
now = datetime.datetime.now()
'{:%Y-%m-%d %H:%M:%S}'.format(now)

---
Nesting arguments

In [None]:
width = 5
for num in range(5,12): 
    for base in 'dXob':
        print('Base {base}: {0:{width}{base}}\t'.format(num, base=base, width=width), end=' ')
    print()
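The same nesting can be wrapped in a helper. `format_in_bases` is a hypothetical function written for this note; it builds the format specification from its `width` argument and each base letter.

```python
def format_in_bases(number, width=8):
    """Return number in decimal, hex, octal and binary, right-aligned in width."""
    parts = []
    for base in 'dxob':
        # {width} and {base} are substituted first, giving e.g. '{0:>6x}'
        parts.append('{0:>{width}{base}}'.format(number, width=width, base=base))
    return '|'.join(parts)

print(format_in_bases(10, width=6))
```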

---
Older syntax

In [None]:
'The "%d little %s" is a nice %s' % (valueA,valueB,valueC)

---
## Regular Expressions

In [None]:
import re

In [None]:
p = re.compile('ab*')
p

In [None]:
p = re.compile('ab*', re.IGNORECASE)
p

In [None]:
m = p.match('tempo')
print(m)

---
### Matching

In [None]:
p = re.compile('[a-z]+') # one or more lowercase letters

In [None]:
result = p.match('Hello World!')

In [None]:
print(result)

In [None]:
result = p.match('tempo')
#print(result)

**result.group()**: Return the string matched by the RE

**result.start()**: Return the starting position of the match

**result.end()**: Return the ending position of the match

**result.span()**: Return both (start, end)

In [None]:
result.group()

In [None]:
result.start()

In [None]:
result.end()
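The four methods can be seen together in a short sketch. The input string `'tempo rubato'` is made up for the illustration; the pattern stops at the first character that is not a lowercase letter.

```python
import re

p = re.compile('[a-z]+')            # one or more lowercase letters
result = p.match('tempo rubato')    # hypothetical input

if result:
    print('group:', result.group())   # the matched text
    print('start:', result.start())   # index where the match begins
    print('end:  ', result.end())     # index just past the match
    print('span: ', result.span())    # (start, end) as a tuple
```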

---

In [None]:
p = re.compile('.*HELLO.*') # anything, then HELLO, then anything

In [None]:
m = p.match('gsdfgsdfgs  HELLO  __!@£§≈[|ÅÄÖ‚…’ﬁ]')

In [None]:
m.group()

In [None]:
p = re.compile('HELLO') # HELLO on its own, so that we can find its position

m = p.match('gsdfgsdfgs  HELLO  __!@£§≈[|ÅÄÖ‚…’ﬁ]')

In [None]:
m.group() # raises AttributeError: match() only looks at the start of the string, so m is None

**Typical code structure:**

```python
p = re.compile( ... )
m = p.match( 'string goes here' )
if m:
    print('Match found: ', m.group())
else:
    print('No match')
```

---
### Searching

In [None]:
p = re.compile('HELLO') # HELLO anywhere in the string; search() also gives its position
m = p.search('gsdfgsdfgs  HELLO  __!@£§≈[|ÅÄÖ‚…’ﬁ]')

m.group()

In [None]:
m.start()

In [None]:
m.end()

In [None]:
m = p.search('gsdfgsdfgs  HeLLo  __!@£§≈[|ÅÄÖ‚…’ﬁ]  HELLO  ...ÖQ!§<>kds')

In [None]:
print(m)

In [None]:
m.start()
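To make the difference between `match` and `search` explicit, here is a small sketch on a made-up string. It also shows `re.IGNORECASE`, introduced earlier, combined with `search`.

```python
import re

text = 'say HELLO twice: HELLO'     # hypothetical input
p = re.compile('HELLO')

at_start = p.match(text)      # None: the text does not begin with HELLO
anywhere = p.search(text)     # match object for the first HELLO

if anywhere:
    print('Found {} at {}'.format(anywhere.group(), anywhere.span()))

# With re.IGNORECASE, differently cased spellings are found too
relaxed = re.compile('hello', re.IGNORECASE).search('say HeLLo')
print(relaxed.group())
```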

---
### Finding all the matching patterns

In [None]:
p = re.compile('HELLO') # HELLO anywhere in the string
matches = p.findall('gsdfgsdfgs  HeLLo  __!@£§≈[|ÅÄÖ‚…’ﬁ]  HELLO  ...ÖQ!§<>kds')

In [None]:
print(matches)

In [None]:
matches = p.findall('gsdfgsdfgs  HELLO  __!@£§≈[|ÅÄÖ‚…’ﬁ]  HELLO  ...ÖQ!§<>kds')
print(matches)

In [None]:
# findall() returns plain strings, not match objects, so m.group() raises AttributeError
for m in matches:
    print('Found {0:30} at position {1}'.format(m.group(), m.start()) )

---

In [None]:
objects = p.finditer('gsdfgsdfgs  HELLO  __!@£§≈[|ÅÄÖ‚…’ﬁ]  HELLO  ...ÖQ!§<>kds')
print(objects)

In [None]:
for m in objects:
    print('Found {0:^30} at position {1}'.format(m.group(), m.start()) )

In [None]:
# Prints nothing: the iterator was already consumed by the previous loop
for m in objects:
    print('Found {0:^30} at position {1}'.format(m.group(), m.start()) )

In [None]:
objects = p.finditer('gsdfgsdfgs  HELLO  __!@£§≈[|ÅÄÖ‚…’ﬁ]  HELLO  ...ÖQ!§<>kds')
for m in objects:
    print('Found {0:^30} at position {1}'.format(m.group(), m.span()) )
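Since `finditer` returns an iterator that can be consumed only once, one option is to store the match objects in a list when they are needed more than once. A minimal sketch on a made-up string:

```python
import re

p = re.compile('HELLO')
text = 'xx HELLO yy HELLO zz'       # hypothetical input

# An iterator is exhausted after the first pass
iterator = p.finditer(text)
first_pass = [m.start() for m in iterator]
second_pass = [m.start() for m in iterator]   # empty list
print(first_pass, second_pass)

# A list of match objects can be reused as often as needed
matches = list(p.finditer(text))
spans = [m.span() for m in matches]
words = [m.group() for m in matches]
print(spans, words)
```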

---
Patterns are described in a small language of their own.
For example, **\d+** matches one or more digits.

In [None]:
p = re.compile(r'\d+') # one or more digits; the raw string keeps the backslash intact
p.findall('12 drummers drumming, 11 pipers piping, 10 lords a-leaping')
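`findall` returns the digits as strings, so they need converting before any arithmetic. A short sketch, reusing the sentence above:

```python
import re

p = re.compile(r'\d+')   # one or more digits
found = p.findall('12 drummers drumming, 11 pipers piping, 10 lords a-leaping')

numbers = [int(s) for s in found]   # convert each matched string to an int
print(found, numbers, sum(numbers))
```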

---
Ramesh's example: the RxLR motif, where x can be anything.

The pattern to search for is written **R.LR**, because `.` matches any character (except a newline).

In [None]:
data = "MGCRYAVLALAVAYFAGSIAANDSQIVAVKGPASIRFTPAIHVVRGRFLRAANTADERNEDRGINLKSMPGFEKIASLFTKKNTPGPLLSWFEKKKSPDYVFLKLKINKGKQQLFDHPDWNVWVQYTTSVVKSDPEEAMIAALRTHYTDDILSKLLESAKNVPKTSGLATKMQMEHWVASKTPSQMFQFLRLDKVRNGVLDDPTLSIWINYMKLYNSKPVNKKQQVTLVSMLTTHYKDRGVLDIIEAAKKVPKTAPAARQLEMEQIQFWLKNGKSPDELLTVLSLDKAGNQLLASPRFKFWSKYVDNYNRDFPDEATTVMATLRNQLGDEDITPILIAAGKVPSTEKAAAKLQAEQFKSWLRENEDPAKVFQLLKLDNSADDLLGSPQFKLWGKYVEDLNLKPEHNDLQVSIITILRKNYGDDVLGNMVLAGKKAPSTSFMARRLEDELYKGWIAAGSSPDGVFKHLKFDKAGENVIQSPLWGLYTKFLEHYYKSFPTPMMSALAKGYDGDALAKLLIAAEKIPTSNTLATKLQTGQIQRWLDDKDQPGKIFKALLLDDMADDILTSPLFNTWTRYLDEFNKKFPDEKVSMTDTFRTSLDDETLKSLLITAKELPDMKTLSTKLQTVQIERWLASKTSPEDAFAVLALNKAGGNVLSKPLLNTWAAYLESFNAKFPRSRVSMIDTFREFFGDKALLTTLAAAKEVESTKKVATSLQDSLLSKWVLAKKPPSGVAKLVGTDEAGAKLLKTYTTKYMERYGQ"

In [None]:
p = re.compile('R.LR')
m = p.search(data)

In [None]:
m.group()

In [None]:
m.span()

After RxLR, we should also look for a second pattern, **EER**.


In [None]:
p2 = re.compile('EER')
m2 = p2.search(data)

In [None]:
m2.start()

In [None]:
p = re.compile('R.LR.*EER') # RxLR....anything....EER
m = p.search(data)

In [None]:
if m:
    print('Match found: {} at position {}'.format(m.group(), m.span()) )
else:
    print('No match')
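The same steps can be tried on a short sequence where the result is easy to check by eye. This is only a sketch: `toy_sequence` is invented for the illustration and is not real data, and `find_motif` is a hypothetical helper built on `re.finditer`.

```python
import re

def find_motif(pattern, sequence):
    """Return (matched text, start position) for every non-overlapping match."""
    return [(m.group(), m.start()) for m in re.finditer(pattern, sequence)]

toy_sequence = 'MKRSLRAAAEERGGRTLRK'   # hypothetical sequence, not real data

rxlr_hits = find_motif('R.LR', toy_sequence)   # R, any character, then LR
eer_hits = find_motif('EER', toy_sequence)
print(rxlr_hits, eer_hits)

# RxLR, then anything, then EER
combined = re.search('R.LR.*EER', toy_sequence)
if combined:
    print('Match found: {} at position {}'.format(combined.group(), combined.span()))
else:
    print('No match')
```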

<div style="text-align:center;border:0;padding: 1em;margin: 0 auto;width: 60%;border-radius: 1em;box-shadow: 0px 0px 3px 0px rgba(0,0,0,0.75);">
Go to the <a href="https://github.com/NBISweden/PythonCourse/tree/vt17/regexp">RxLR example</a> on the course repository
</div>
