## General notions

- The cardinality (number of elements) of any set X is indicated as |X|. It can be finite or infinite.
- Natural numbers $N$: {0, 1, 2, ...}.
 - For a natural number n, [n] is the set {1, . . . , n}.
- Integers $Z$: {..., -2, -1, 0, 1, 2, ...}.
- Rational numbers $Q$: a rational number is a number that can be expressed as the quotient or fraction $\frac{p}{q}$ of two integers, a numerator p and a non-zero denominator q.

## Strings

- The **set of all strings** over S of length exactly n ∈ N is indicated as $S^n$ 
   - example: if S={0,1} then $S^3$={(1,1,1),(1,1,0),(0,1,1),(1,0,1),(1,0,0),(0,1,0),(0,0,1),(0,0,0)}
   - $S^0$ = {ε}, where ε is the empty string. Note that the empty string ε has length 0, but the set containing the empty string is not the empty set: it has cardinality 1 (the set with cardinality 0 is the empty set {} = ∅). So |$\epsilon$| = 0, but |{$\epsilon$}| = 1.
   
- The set of all strings over S is $\cup_{n=0} ^∞ S^n$ and is indicated as $S^∗$
  - example: if S={0,1} then $S^*$={ε,0,1,(0,0),(0,1),(1,0),(1,1),(1,1,1),(1,1,0), ...}
  
- The concatenation of two strings x and y over S is indicated as xy. 
 
- The **length of a string** x is indicated as |x|

- An important class of functions from {0, 1}∗ to {0, 1}∗ consists of those whose outputs are strings of length exactly one, called **boolean functions**.
    - They are characteristic functions.
    - We identify such a function f with the subset Lf of {0, 1}∗ defined as follows:
        - Lf = {x ∈ {0, 1}∗ | f (x) = 1}.
        - **What is a language?** A language is nothing more than a subset of {0,1}*, so it is just a set of strings over an alphabet. Every language has an associated characteristic function: a boolean function that "recognizes" the language, i.e. it outputs 1 exactly on the strings that belong to it. Examples of languages: the set of binary strings whose parity is 0, the set of binary strings that encode prime numbers, the set of binary strings that are palindromes.

- Example: Given two sets A = {1, 2, 3} and B = {a, b, c} their Cartesian product is A × B = {<1,a>,<1,b>,<1,c>,<2,a>,<2,b>,<2,c>,<3,a>,<3,b>,<3,c>}.
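
The definitions above can be checked on small cases. This is a minimal Python sketch, not part of the course material: the helper names `strings_of_length`, `strings_up_to` and `parity_zero` are made up for illustration, and strings are represented as Python `str` values instead of tuples.

```python
from itertools import product

def strings_of_length(S, n):
    """All strings over the alphabet S of length exactly n, i.e. S^n."""
    return ["".join(t) for t in product(S, repeat=n)]

def strings_up_to(S, max_len):
    """A finite prefix of S^*: the union of S^0, ..., S^max_len."""
    out = []
    for n in range(max_len + 1):
        out.extend(strings_of_length(S, n))
    return out

def parity_zero(x):
    """Characteristic function of the language of strings with an even number of 1s."""
    return 1 if x.count("1") % 2 == 0 else 0

S = ["0", "1"]
print(len(strings_of_length(S, 3)))   # |S^3| = 2^3 = 8
print(strings_of_length(S, 0))        # S^0 contains only the empty string: ['']
language = [x for x in strings_up_to(S, 2) if parity_zero(x) == 1]
print(language)                       # ['', '0', '00', '11']
```

The list `language` is the part of Lf (for f = `parity_zero`) made of strings of length at most 2.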

## O(n), Ω(n), Θ(n)

Given two functions f and g, we compute the limit of their ratio f(n)/g(n) as n → ∞:
- if it is **infinite**: the numerator is Ω(denominator) and ω(denominator)

- if it is **zero**: the numerator is O(denominator) and o(denominator)

- if it is a **non-zero constant**: the numerator is Θ(denominator), hence both O(denominator) and Ω(denominator), and vice versa. (Note that little-o and little-ω do NOT hold here!)
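
As a worked example of the limit test, here is one case per outcome (the three pairs of functions are chosen only for illustration):

$$\lim_{n\to\infty} \frac{n^2}{n\log n} = \lim_{n\to\infty} \frac{n}{\log n} = \infty \;\Rightarrow\; n^2 = \Omega(n\log n) \text{ and } n^2 = \omega(n\log n)$$

$$\lim_{n\to\infty} \frac{\log n}{n} = 0 \;\Rightarrow\; \log n = O(n) \text{ and } \log n = o(n)$$

$$\lim_{n\to\infty} \frac{3n^2 + 5n}{n^2} = 3 \;\Rightarrow\; 3n^2 + 5n = \Theta(n^2)$$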

## Turing Machine

- At each step, the machine: <br>
    1) Reads the symbols under the k tape heads. <br>
    2) For the k−1 read-write tapes, replaces the symbol with a new one, or leaves it unchanged. <br>
    3) Changes its state to a new one. <br>
    4) Moves each of the k tape heads to the left or to the right (or leaves it in place). <br>

- Note that a single tape can serve as input, work and output tape. By definition the input tape is read-only, but you can state this assumption explicitly and proceed.

- **Increment of a number** costs O(2n), so **O(n)**, where n is the length of the encoded input (for example 0 0 1 1 for the number 12, written from LSB to MSB).
- **Counting increment** depends on the length of the input string being counted. If the input string has length n = 9 (1 0 0 1 1 0 0 0 0), the counter only needs ⌊log(n)⌋ + 1 = 4 bits (at most it has to represent the number 9, which is 1 0 0 1). So each increment of a counter that depends on the input string costs **O(log(n))**, with n the length of the input string.
- **Multiplication** costs O(n^2) using the grid method.
- **Sum** costs O(n).
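
The size of the counter in the "counting increment" case can be checked quickly. A minimal sketch, assuming logarithms in base 2; `counter_bits` is a name made up here:

```python
def counter_bits(n):
    """Bits needed to write n in binary: floor(log2(n)) + 1 for n >= 1."""
    return max(1, n.bit_length())

for n in (1, 8, 9, 1000):
    print(n, counter_bits(n))
# 9 needs 4 bits (1001) and 1000 needs 10 bits, so the counter stays
# logarithmic in the length of the string being scanned.
```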

### How to prove that something is in P or FP:
As we did in some exercises, there are different ways:
- implement/describe the TM formally.
- describe the TM informally.
- write pseudocode. NOTE: even when you write pseudocode, you must still think about the TM.
    - Example: prove that the problem of matching two strings (if one is shorter, this means checking whether it is contained in the other one) is in P.
      Solution:
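      ```python
      def matches(x, y):
          # Assumes |x| >= |y|; returns True iff y occurs in x as a substring.
          i = 0
          while i < len(x) - len(y) + 1:
              # Compare the window of length |y| starting at position i with y.
              if x[i:i + len(y)] == y:
                  return True
              i += 1
          return False
      ```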
      Once you've written the pseudocode, how can you argue that it does the computation in polynomial time?
      
      You must reason about it:
      - the number of iterations is <= |x|
      - for each iteration:
      
          - the instruction cost: the instruction here is the comparison of a window of x of length |y| against y. Doing this comparison with a TM with 2 work tapes takes O(|y|).
          
          - the intermediate results: these are all the variables needed to count something or to store partial results. In this case the only intermediate result is the variable "i". Incrementing i takes O(log(|x|)) in the worst case, because i never exceeds |x|, which can be represented with about log(|x|) bits.

So the overall cost is O(|x| * (|y| + log(|x|))). Both O(|x| * |y|) and O(|x| * log(|x|)) are certainly below O(|x,y|^2), so the overall cost of the process is polynomial.
Note: if the number of iterations, the cost of every instruction and the size of all intermediate results are polynomially bounded wrt the INPUT length, then you can say that the algorithm works in polynomial time.
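
To see the O(|x| * |y|) bound concretely, the comparison can be unrolled into single-character checks and counted. This is only a sketch on ordinary Python strings, not a TM simulation; `matches_counting` is a hypothetical helper name.

```python
def matches_counting(x, y):
    """Naive substring search; returns (found, character_comparisons)."""
    comparisons = 0
    for i in range(len(x) - len(y) + 1):
        j = 0
        while j < len(y):
            comparisons += 1
            if x[i + j] != y[j]:
                break               # mismatch: move on to the next window
            j += 1
        if j == len(y):
            return True, comparisons
    return False, comparisons

x, y = "0000000000", "0001"
found, comparisons = matches_counting(x, y)
print(found, comparisons, len(x) * len(y))   # False 28 40
```

On this input every one of the 7 windows fails only at its last character, which is close to the worst case, and the count still stays below |x| * |y|.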

**EX 3.1**
Given a list A = [a1 , . . . , an] of natural numbers and a number v ∈ N, return an index
i ∈ {1, . . . , n} such that A[i] = v, if any, and return −1 otherwise. Show that this problem is computable in polynomial time.

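```python
def find_index(A, v):
    # Return a 1-based index i with A[i] = v, or -1 if v does not occur in A.
    i = 1
    while i <= len(A):
        if A[i - 1] == v:  # Python lists are 0-based
            return i
        i += 1
    return -1
```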
**SOLUTION**

How to prove that this algorithm works in polynomial time? As seen in class, we do that in **four
steps**: <br>
1-  We encode the input as a binary string.
    - For that, we regard a list of elements [b1 , . . . , bn] as a ‘pair of pairs’ of the form ((...((b1 , b2), b3), ...), bn). Given a pair (x, y) of bitstrings, we first write it as the string x#y over the alphabet {0,1,#}, then encode each 1 as 11, each 0 as 00 and the # as 10.

2-  We prove that the number of instructions of the algorithm is bounded by a polynomial wrt the length of the input.
    - 1 assignment (i ← 1).
    - n iterations of:
        - 1 comparison for the loop condition (i ≤ n).
        - A conditional branching performing:
           - An equality check (A[i] = v),
           - Either a return instruction or an assignment (i ← i + 1).
    - 1 return instruction


    - Therefore, the number of instructions is of the form b + c · n for suitable constants b, c (b accounts for the initial assignment of i and the final return statement), and thus it is bounded by a polynomial wrt the length of the input.
 


3- We argue that each instruction can be simulated by a Turing machine in polynomial time.
    - In order to prove this, we have to argue that all the previous instructions can be simulated by a TM in polynomial time. For instance, an equality check can be simulated as follows. Say we have two values a and b stored in different portions of a tape of a TM. In order to check whether a is equal to b, the machine simply moves back and forth between a and b, checking whether they are bitwise equal. This can be done in polynomial time with respect to the length of a and b. Moreover, we have already shown that the increment i ← i + 1 can be done in polynomial time.

4- We show that all ‘intermediate’ data and results of the algorithm are bounded by a polynomial wrt the length of the input.
    - to prove this, we simply observe that the only intermediate value computed
    by our algorithm is i, which can be at most n (and therefore it is bounded by a polynomial wrt the length of the input).
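
The encoding of a pair from step 1 can be sketched as follows. This is a minimal illustration for a single pair of bitstrings; the names `DOUBLE`, `encode_pair` and `decode_pair` are assumptions made here, not notation from the course.

```python
# Symbol-by-symbol doubling described in step 1: 1 -> 11, 0 -> 00, # -> 10.
DOUBLE = {"0": "00", "1": "11", "#": "10"}

def encode_pair(x, y):
    """Encode the pair (x, y) of bitstrings as one string over {0, 1}."""
    return "".join(DOUBLE[c] for c in x + "#" + y)

def decode_pair(s):
    """Inverse of encode_pair: read the string two symbols at a time."""
    undouble = {v: k for k, v in DOUBLE.items()}
    plain = "".join(undouble[s[i:i + 2]] for i in range(0, len(s), 2))
    x, y = plain.split("#")
    return x, y

code = encode_pair("101", "01")
print(code)                 # 110011100011
print(decode_pair(code))    # ('101', '01')
```

The encoded pair has length 2 · (|x| + |y| + 1), so it is linear in the size of the two components.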


**EX 3.2**
Sort a list A = [a1 , . . . , an ] of natural numbers. <br>
Show that the following problem is computable in polynomial time. <br>
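```python
def bubble_sort(A):
    # Repeat passes over A until one full pass performs no exchange.
    while True:
        exchanges = 0
        i = 0
        while i < len(A) - 1:
            if A[i + 1] < A[i]:
                A[i], A[i + 1] = A[i + 1], A[i]  # swap adjacent elements
                exchanges += 1
            i += 1
        if exchanges == 0:
            break
    return A
```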

**SOLUTION** <br>
1-  We encode the input as a binary string.
    - For that, we regard a list of elements [b1 , . . . , bn] as a ‘pair of pairs’ of the form ((...((b1 , b2), b3), ...), bn). Given a pair (x, y) of bitstrings, we first write it as the string x#y over the alphabet {0,1,#}, then encode each 1 as 11, each 0 as 00 and the # as 10.
    
2- We prove that the number of instructions of the algorithm is bounded by a polynomial wrt the length of the input.
    - at most n iterations of:
        - two assignments (exchanges and i)
        - at most n iterations of:
            - A conditional branching performing:
                - An order comparison (A[i + 1] < A[i]),
                - If the comparison holds, do: 
                    - a swap (A[i + 1] <-> A[i])
                    - increment (exchanges++)
            - increment (i++)
    - 1 return instruction
    - In the worst case, which is a reverse-ordered list, the number of instructions is bounded by a quadratic function (n^2) of the length of the input. In particular the number of instructions is of the form b + c · n^2, for suitable constants b, c, and thus it is bounded by a polynomial wrt the length of the input.
    
3- We argue that each instruction can be simulated by a Turing machine in polynomial time.
    - In order to prove this, we have to argue that all the previous instructions can be simulated by a TM in polynomial time. We have already shown that the order comparison (A[i + 1] < A[i]), the increment (i ← i + 1) and the swap can be done in polynomial time.
    
4- We show that all ‘intermediate’ data and results of the algorithm are bounded by a polynomial wrt the length of the input.
    - to prove this, we simply observe that the only intermediate values computed
    by our algorithm are i and exchanges, each of which can be at most n (and therefore they are bounded by a polynomial wrt the length of the input).
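
The quadratic bound from step 2 can be observed by counting passes and comparisons on a reverse-ordered list. A minimal sketch; `bubble_sort_counting` is a hypothetical instrumented variant of the algorithm above that works on a copy of the list.

```python
def bubble_sort_counting(A):
    """Bubble sort on a copy of A; returns (sorted_list, passes, comparisons)."""
    A = list(A)
    passes = comparisons = 0
    while True:
        exchanges = 0
        for i in range(len(A) - 1):
            comparisons += 1
            if A[i + 1] < A[i]:
                A[i], A[i + 1] = A[i + 1], A[i]
                exchanges += 1
        passes += 1
        if exchanges == 0:          # a clean pass: the list is sorted
            return A, passes, comparisons

n = 5
result, passes, comparisons = bubble_sort_counting(range(n, 0, -1))
print(result, passes, comparisons)   # [1, 2, 3, 4, 5] 5 20
```

For the reversed list of length n the sketch performs n passes of n − 1 comparisons each, which matches the b + c · n^2 shape of the bound.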

## Explicit Turing Machines:
1) Show that the function Inc: N -> N such that Inc(n)=n+1 can be computed in linear time by giving an EXPLICIT construction of a TM computing the function.

Solution: 

- How would you solve the problem by hand? 
  - Simply add one: starting from the LSB, turn every 1 you scan into a 0 until you reach the zero closest to the LSB. Once you reach that zero, change it into a 1 and halt.

- How do I encode N?
  - Two easy ways: the usual one, from MSB to LSB (for instance 12 = 1100), or the reverse one, from LSB to MSB (12 = 0011). For this problem the reverse encoding is more efficient, because the head starts at the LSB and scans fewer cells to find the zero closest to it.

- How many tapes? What role does each of them have?
  - Since we only need to rewrite symbols of the string, I can use **only one tape** (used as input, work tape and output). **Note that a single tape can serve as input, work and output tape. By definition the input tape is read-only, but you can state this assumption explicitly and proceed.**
  
- Write the transition function $\delta$ (here we chose to go back to the starting cell before halting, but this is not actually necessary). Remember to also handle the case in which the string is all ones (1111...): then the blank symbol after the string must be turned into a 1.
<img src="trans_fun.png" width=30% height=30%>
Explanation:
- For instance, the first line means that if you are in the state q_init and the symbol you read is ">", then the state becomes q0, you overwrite ">" with ">" and you move the tape head to the right.

- Do not confuse the state with the tape head position. For instance, consider q0: you start from q_init and the next state is necessarily q0. If the cell contains a 1 you move to the right, but you are still in the state q0! You stay in q0 until a 0 is found. 

- It's useful to associate a meaning with each state. For instance, q0 is the state you are in as long as you haven't found a cell with 0. q1 is the state you are in once you've found a 0, so you're going back to change all the ones into zeros (because of the carry of the sum).

- It is not mandatory to return to the initial cell to halt the machine.

- Note that q2 and q1 can actually be merged: they come from different ideas (q1 = coming back after reaching a 0, q2 = coming back after reaching a blank), but in the end they turn out to be redundant.

- To really understand it, pick a number and compute the increment with this transition function.

- As you can see, in the worst case even this (not super efficient) TM solves the problem by scanning the input twice, so in 2n steps, which is O(n): the problem can be computed in linear time.
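
To try the idea on concrete numbers, here is a small Python simulation of a one-tape increment machine. It is a sketch under assumptions: the table `DELTA` is NOT the transition function shown in the figure but a simpler single-pass variant that flips the ones while moving right and halts without returning to the start; the state names `init`, `carry`, `halt` and the blank symbol `_` are made up here.

```python
BLANK = "_"

# (state, read symbol) -> (new state, written symbol, head move)
DELTA = {
    ("init", ">"): ("carry", ">", +1),
    ("carry", "1"): ("carry", "0", +1),   # 1 + carry = 0, the carry moves on
    ("carry", "0"): ("halt", "1", 0),     # the first 0 absorbs the carry
    ("carry", BLANK): ("halt", "1", 0),   # all ones: the number gets one bit longer
}

def run_increment(bits):
    """Run the machine on an LSB-first bitstring; returns (result, steps)."""
    tape = [">"] + list(bits) + [BLANK]
    state, head, steps = "init", 0, 0
    while state != "halt":
        state, tape[head], move = DELTA[(state, tape[head])]
        head += move
        steps += 1
    return "".join(tape[1:]).rstrip(BLANK), steps

print(run_increment("0011"))   # 12 -> 13: ('1011', 2)
print(run_increment("111"))    # 7 -> 8:   ('0001', 5)
```

This variant takes at most n + 2 steps on an input of length n, so it is also linear; the all-ones input is the worst case, as in the discussion above.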


2) Given a binary string x $\in$ {0,1}*, we write $|x|_0$ and $|x|_1$ for the number of 0s and 1s occurring in x, respectively (example: $|00101|_0=3$, $|00101|_1=2$). Show that the language L is decidable in time proportional to n log(n), where L = { x $\in$ {0,1}* | $|x|_0$ = $|x|_1$ } (that is, decide whether a string x contains the same number of 0s and 1s).

Solution: 

- Here the input is already encoded.

- How would you solve the task?
  - A naive solution is to scan the string while keeping the count of 0s in one variable and the count of 1s in another. This suggests using 2 work tapes.
  
- How many tapes? One for the input, 2 work tapes, 1 for the output.

- How does the TM solve the task, in more detail?
   - Scan the input tape. At each cell, increment either the count of 0s or the count of 1s, with the same technique as in exercise 1. At the end, compare the two work tapes bit by bit: if there is a position in which they differ, write 0 on the output tape and halt; if instead the first blank symbol is reached on both tapes, write 1 and halt.

- What's the computational cost?

   - Since you want to apply the increment function on the work tapes, you must initialize the first cell (after the ">" symbol) of each of them to 0 (otherwise the increment is applied to a blank symbol and halts immediately). This has a constant cost: in this case T(n) = 2, so it is O(1).
   
   - You need to scan the whole input tape. This takes T(n) = n, so it is O(n).
   
     You also need to increment one of the work tapes FOR EACH INPUT CELL SCANNED. The increment is done using the process described in exercise 1. We know that incrementing a string of length x takes time O(x). Here x is the length of the string on the work tape, and we need to express it in terms of n (the length of the input string). Consider the worst case, in which the input is a string of n ones: then the counter must count up to n. How many bits are needed? ⌊log(n)⌋ + 1 (example: to represent 8 you need 4 bits: 1000). So each increment takes O(log(n)), and the scan of the input together with the counting takes O(n*log(n)) overall.
   
   - You need to compare the two work tapes bit by bit. In the worst case you must scan their whole length, which is log(n) + 1, so this step costs O(log(n)).
  
  So the overall cost is: O(1) + O(nlog(n)) + O(log(n)) = O(nlog(n)).
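
The counting strategy can be sketched in Python with the two work tapes modelled as LSB-first bit lists. This is an illustration under assumptions, not a TM: `increment` and `same_zeros_and_ones` are names chosen here, and `steps` is only a rough count of visited cells.

```python
def increment(bits):
    """Increment an LSB-first bit list in place; returns the cells visited."""
    for i, b in enumerate(bits):
        if b == 0:
            bits[i] = 1
            return i + 1
        bits[i] = 0                 # 1 + carry = 0, the carry moves on
    bits.append(1)                  # all ones: the counter grows by one bit
    return len(bits)

def same_zeros_and_ones(x):
    """Decide whether x has as many 0s as 1s; returns (answer, steps)."""
    zeros, ones = [0], [0]          # both work tapes initialised to 0
    steps = 2
    for symbol in x:                # one scan of the input tape
        steps += 1 + increment(zeros if symbol == "0" else ones)
    steps += max(len(zeros), len(ones))   # bitwise comparison of the counters
    return int(zeros == ones), steps

print(same_zeros_and_ones("00101")[0])   # 0: three 0s, two 1s
print(same_zeros_and_ones("0011")[0])    # 1: two 0s, two 1s
```

Each call to `increment` touches at most ⌊log(n)⌋ + 1 cells, so the total stays within the O(n log(n)) bound derived above.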

# Reductions
CLIQUE <=p SAT <br>
SAT <=p 3SAT <br>
SAT <=p SAT <br>
3SAT <=p CLIQUE <br>
2SAT is solvable in polynomial time, so it is the least difficult of these problems (unless P = NP).

## Exam exercises
**1SAT** <br>

We studied the classes SAT, and 3SAT, and 2SAT. You are required to classify the problem 1SAT containing all (and only) the satisfiable 1CNFs. In particular, to which complexity class does 1SAT belong? Prove your claim.

Answer: <br>

Since a 1CNF has only one literal per clause, it is just an AND of N literals. We can loop over all the literals and, for each of them, check that none of the other N−1 literals is the same variable negated. The formula is satisfiable exactly when no variable appears both plain and negated, and this check takes O(N^2) comparisons. Therefore 1SAT is in P.

For example: A^B^not C is in 1SAT, while A^B^C^not A is not.
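
The check described in the answer can be sketched as follows. The representation of a 1CNF as a list of `(variable, is_positive)` pairs and the name `one_sat` are assumptions made for this illustration.

```python
def one_sat(literals):
    """literals: one (variable, is_positive) pair per clause.
    Returns True iff the 1CNF is satisfiable."""
    for i, (var_i, sign_i) in enumerate(literals):
        for var_j, sign_j in literals[i + 1:]:
            if var_i == var_j and sign_i != sign_j:
                return False        # x and not x cannot both be true
    return True

# A ^ B ^ not C
print(one_sat([("A", True), ("B", True), ("C", False)]))               # True
# A ^ B ^ C ^ not A
print(one_sat([("A", True), ("B", True), ("C", True), ("A", False)]))  # False
```

The two nested loops perform at most N^2 comparisons of literals, which is the polynomial bound used in the answer.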

**CLIQUE** <br>

We studied the problem CLIQUE. You are required to classify the subset THREECLIQUE of CLIQUE consisting of all the pairs (G,3). To which class does THREECLIQUE belong?

One can test whether a graph G contains a k-vertex clique, and find any such clique that it contains, using a brute force algorithm. This algorithm examines each subgraph with k vertices and checks whether it forms a clique. It takes time $O(n^k k^2)$, because there are $O(n^k)$ subgraphs to check, each of which has $O(k^2)$ edges whose presence in G needs to be checked. Thus, the problem can be solved in polynomial time whenever k is a fixed constant. For THREECLIQUE we have k = 3, so the running time is $O(n^3)$ and THREECLIQUE is in P. However, when k does not have a fixed value, but instead may vary as part of the input to the problem, the running time of this brute force algorithm is exponential.
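
For k = 3 the brute force search is short enough to write out. A minimal sketch, assuming the graph is given as a list of vertices and a list of undirected edges; `has_three_clique` is a hypothetical name.

```python
from itertools import combinations

def has_three_clique(vertices, edges):
    """Brute force: try every set of 3 vertices and check its 3 edges."""
    edge_set = {frozenset(e) for e in edges}
    for triple in combinations(vertices, 3):
        if all(frozenset(pair) in edge_set for pair in combinations(triple, 2)):
            return True
    return False

triangle = [(1, 2), (2, 3), (1, 3), (3, 4)]
path = [(1, 2), (2, 3), (3, 4)]
print(has_three_clique([1, 2, 3, 4], triangle))   # True: {1, 2, 3}
print(has_three_clique([1, 2, 3, 4], path))       # False
```

The outer loop runs over the O(n^3) vertex triples and each triple needs 3 edge lookups, which is the polynomial bound for fixed k = 3.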