# Two Sum problem

## Problem Definition

Given an array of integers, return indices of the two numbers such that they add up to a specific target.

## Coding Problem

The Two Sum problem is a common coding problem. Given an array of integers and a target integer, the task is to find two numbers in the array that add up to the target and to return their indices.

For example, consider the following array of integers: [2, 7, 11, 15]. If the target integer is 9, then the solution to the Two Sum problem would be the indices of the two numbers that add up to 9, which are 0 and 1 (since 2 + 7 = 9).

The Two Sum problem can be solved in several ways, including using brute force algorithms, hashing, or two-pointers techniques. The problem is often used as a benchmark to compare the efficiency and performance of different programming algorithms.

## Brute Force Solution

### Brute Force Implementation

In [3]:
def two_sum_brute(nums, target):
    """
    Given an array of integers nums and an integer target, return indices of the two numbers such that
    they add up to target. Brute force solution.
    """
    n = len(nums)
    for i in range(n):
        for j in range(i+1, n):
            if nums[i] + nums[j] == target:
                return [i, j]
    # no pair found: return an empty list
    return []

In [2]:
my_list =  [2, 7, 11, 15]
two_sum_brute(my_list, 18)

[1, 2]

In [4]:
two_sum_brute(my_list, 318)

[]

In [6]:
import random
random_list_1k = [random.randint(1,10_000) for _ in range(1_000)]
random_list_1k[:10]

[6377, 5797, 1594, 19, 2883, 1632, 6078, 5248, 3076, 4507]

In [7]:
two_sum_brute(random_list_1k, 1000)

[102, 712]

In [9]:
random_list_1k[102], random_list_1k[712], random_list_1k[102] + random_list_1k[712]

(807, 193, 1000)

In [11]:
%%timeit
two_sum_brute(random_list_1k, 1000)

13.8 ms ± 3.95 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)


### Complexity of Brute Force

The function two_sum_brute takes an array of integers nums and a target integer target as input and returns a list of indices of the two numbers that add up to the target. The function uses two nested loops to check every possible pair of numbers in the array. If the sum of a pair equals the target, the function returns the indices of the pair; if no pair matches, it returns an empty list.

The time complexity of this brute force solution is O(n^2), where n is the length of the input array: in the worst case all n(n-1)/2 pairs are checked. Therefore, this solution is not very efficient for large arrays.

Space complexity is O(1), since we do not use any extra arrays or other data structures whose size depends on the number of elements n.
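
To make the quadratic growth concrete, the following sketch counts how many index pairs the nested loops visit when no pair matches. The helper name count_pair_checks is made up for this illustration and is not part of the notebook.

```python
from itertools import combinations


def count_pair_checks(n):
    """Number of (i, j) pairs with i < j that the nested loops visit for an input of length n."""
    return sum(1 for _ in combinations(range(n), 2))


for n in (10, 100, 1_000):
    # the explicit count agrees with the closed form n * (n - 1) / 2
    print(n, count_pair_checks(n), n * (n - 1) // 2)
# 10 45 45
# 100 4950 4950
# 1000 499500 499500
```

A tenfold increase in n leads to roughly a hundredfold increase in the number of checks.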

## Pointers Solution

The Two Sum problem can also be solved using the two-pointers technique. The idea is to use two pointers, one at the beginning of the array and the other at the end of the array, and then move the pointers towards each other until the sum of the values at the two pointers equals the target.


### Pointers Implementation

In [12]:
def two_sum_with_pointers(nums, target):
    """
    Given an array of integers nums and an integer target, return indices of the two numbers such that
    they add up to target. Two-pointer solution.
    """
    n = len(nums)
    left, right = 0, n-1
    while left < right:
        curr_sum = nums[left] + nums[right]  # avoid shadowing the built-in sum()
        if curr_sum == target:
            return [left, right]
        elif curr_sum < target:
            left += 1
        else:
            right -= 1
    # nothing found
    return []

In [13]:
two_sum_with_pointers(my_list, 18)

[1, 2]

In [14]:
two_sum_with_pointers(my_list, 9018)

[]

In [15]:
# not a good idea to try this on unsorted
two_sum_with_pointers(random_list_1k, 1000)
# so we got no answer, when we know we do have an answer
# because we did not make the correct choice at some decision
# so we should consider sorting the list

[]

### Complexity of Pointers Solution

The function two_sum_with_pointers takes an array of integers nums and a target integer target as input and returns a list of indices of the two numbers that add up to the target. The function initializes two pointers, left and right, at the beginning and end of the array, respectively. It then compares the sum of the values at the two pointers with the target. If the sum equals the target, the function returns the indices of the two pointers. If the sum is less than the target, the left pointer is incremented, and if the sum is greater than the target, the right pointer is decremented. The function continues moving the pointers until it finds a pair that adds up to the target or until the pointers meet.

The time complexity of this two-pointer solution is O(n), where n is the length of the input array, because each step moves one of the pointers and they can move at most n - 1 times in total. On sorted input this is more efficient than the brute force solution for large arrays. On unsorted input the scan is just as fast but the answer is not reliable, as the run on random_list_1k above shows.

### Extra Requirements for Two Pointers Solution

The two-pointer technique can be used to solve the Two Sum problem if the input array is sorted. The idea is to use two pointers, one at the beginning of the array and the other at the end of the array, and then move the pointers towards each other until the sum of the values at the two pointers equals the target.

However, if the input array is not sorted, the two-pointer technique may not work correctly. For example, consider the input array [5, 1, 8, 2] and the target integer 9. The scan checks 5 + 2 = 7 (too small, move left), 1 + 2 = 3 (too small, move left), and 8 + 2 = 10 (too large, move right); the pointers then meet and no pair is reported, even though 1 and 8 add up to 9. The technique relies on the array being sorted: only then can moving the left pointer never decrease the sum, and moving the right pointer never increase it.
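
The dependence on sorted input can be checked directly. The sketch below repeats the plain two-pointer scan under a hypothetical name, two_pointer_scan, and runs it on a small unsorted list and on a sorted copy of it. The list [5, 1, 8, 2] is an illustrative input chosen for this example.

```python
def two_pointer_scan(nums, target):
    """Plain two-pointer scan; only reliable when nums is sorted in ascending order."""
    left, right = 0, len(nums) - 1
    while left < right:
        curr_sum = nums[left] + nums[right]
        if curr_sum == target:
            return [left, right]
        if curr_sum < target:
            left += 1
        else:
            right -= 1
    return []


unsorted_nums = [5, 1, 8, 2]
print(two_pointer_scan(unsorted_nums, 9))          # [] although 1 + 8 == 9
print(two_pointer_scan(sorted(unsorted_nums), 9))  # [0, 3], indices into the sorted copy [1, 2, 5, 8]
```

Note that the second result refers to positions in the sorted copy, not in the original list, so sorting alone does not yet answer the original question.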

Therefore, if the input array is not sorted, we need to sort the array first before applying the two-pointer technique. The time complexity of sorting the array is O(n log n), where n is the length of the input array. After sorting the array, the two-pointer technique can be used to solve the Two Sum problem with a time complexity of O(n), where n is the length of the sorted array.

In [19]:
def two_sum_sorted(nums, target):
    """
    Given an array of integers nums and an integer target, return indices of the two numbers such that
    they add up to target. Two-pointer solution with sorting.
    """
    # Sort the input array
    nums_sorted = sorted(nums) # so this uses timsort which is O(n log n)
    # also we are using additional O(n) space since we keep the OG array nums
    
    # Initialize two pointers at the beginning and end of the array
    left, right = 0, len(nums_sorted) - 1
    
    # Move the pointers towards each other until the sum of the values at the two pointers equals the target
    while left < right:
        curr_sum = nums_sorted[left] + nums_sorted[right]
        if curr_sum == target:
            # Find the indices of the values in the original unsorted array
            index1 = nums.index(nums_sorted[left])
            if nums_sorted[left] == nums_sorted[right]:
                # both numbers are equal (e.g. 3 + 3 == 6): look for the second occurrence after index1
                index2 = nums.index(nums_sorted[right], index1 + 1)
            else:
                index2 = nums.index(nums_sorted[right])
            # the lookups above are linear scans, so they cost 2 x O(n),
            # but they run only once, right before returning, not once per loop iteration
            return [index1, index2]
        elif curr_sum < target:
            left += 1
        else:
            right -= 1
    
    # If no solution is found, return an empty list
    return []

In [20]:
two_sum_sorted(random_list_1k, 1000)

[712, 102]
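
The nums.index lookups cost extra linear scans and need care when both numbers have the same value. One possible alternative, sketched here under the hypothetical name two_sum_argsort, is to sort the positions by value instead of sorting the values themselves, so the original indices are available without any lookup.

```python
def two_sum_argsort(nums, target):
    """Two-pointer search over positions sorted by value (illustrative variant of two_sum_sorted)."""
    # order[p] is the original index of the p-th smallest value
    order = sorted(range(len(nums)), key=lambda idx: nums[idx])
    left, right = 0, len(order) - 1
    while left < right:
        curr_sum = nums[order[left]] + nums[order[right]]
        if curr_sum == target:
            return [order[left], order[right]]
        if curr_sum < target:
            left += 1
        else:
            right -= 1
    return []


print(two_sum_argsort([3, 3], 6))           # [0, 1]
print(two_sum_argsort([2, 7, 11, 15], 18))  # [1, 2]
```

The cost is still dominated by the O(n log n) sort, and the list of positions takes O(n) extra space.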

In [None]:
# hint: we can store some values to speed up our calculation
# so we utilize the hash table (called dictionary in Python)
# to do O(1) storage and O(1) lookups on average - lookup time stays roughly constant on a million or a billion items

In [None]:
# so idea is not to do the same calculation again

## Hashing Solution

The Two Sum problem can also be solved using hashing. The idea is to use a hash table (dictionary in Python) to store the values in the input array as keys and their indices as values. Then, for each value in the input array, we check if the complement (i.e., the difference between the target and the current value) exists in the hash table. If the complement exists, we have found a solution, and we return the indices of the current value and its complement.

### Hashing Implementation

In [None]:
def two_sum_hashing(nums, target):
    """
    Given an array of integers nums and an integer target, return indices of the two numbers such that
    they add up to target. Hashing solution.
    """
    hash_table = {}
    for i, num in enumerate(nums):
        complement = target - num
        if complement in hash_table:  # key part: this lookup is O(1) on average
            return [hash_table[complement], i]
        hash_table[num] = i
    # no pair found: return an empty list, like the other solutions
    return []

The function two_sum_hashing takes an array of integers nums and a target integer target as input and returns a list of indices of the two numbers that add up to the target. The function initializes an empty hash table hash_table and then loops through the input array. For each value, it computes the complement (i.e., the difference between the target and the value) and checks whether the complement is already in the hash table. If it is, the function returns the stored index of the complement together with the current index. Otherwise, it adds the current value and its index to the hash table and moves on.

The time complexity of this hashing solution is O(n) on average, where n is the length of the input array, because each of the n dictionary lookups and insertions takes O(1) expected time.

Note that the hashing solution still works correctly if the input array contains duplicate values. Each value is looked up before it is inserted, so the table only ever contains earlier positions and a number is never paired with itself. For example, with nums = [3, 3] and target 6, the second 3 finds the first one in the table and the function returns [0, 1].
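
The behaviour with repeated values can be tried out with a short sketch. The function below follows the same idea as two_sum_hashing but is written under a separate, hypothetical name so that the block runs on its own; the inputs are illustrative.

```python
def two_sum_hashing_demo(nums, target):
    """Look up the complement of each value among the values seen earlier."""
    seen = {}  # value -> index of an earlier occurrence
    for i, num in enumerate(nums):
        complement = target - num
        if complement in seen:
            return [seen[complement], i]
        seen[num] = i
    return []


print(two_sum_hashing_demo([3, 3], 6))     # [0, 1]
print(two_sum_hashing_demo([3, 2, 4], 6))  # [1, 2]
```

In the second call the value 3 is not paired with itself, because the lookup happens before the value is stored.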

### Space Complexity of Hashing Solution

The space complexity of the hashing solution for the Two Sum problem is O(n), where n is the length of the input array. This is because the solution uses a hash table (dictionary in Python) to store the values in the input array as keys and their indices as values. The size of the hash table is proportional to the number of values in the input array, which is n. Therefore, the space complexity of the hashing solution is O(n).

In the worst case, all the values in the input array are unique and no pair is found until the end, so the hash table grows to the length of the input array and the space complexity is O(n). If a matching pair is found early, only the values before the match have been stored, so the table stays small for that particular input. The worst-case bound is still O(n).

Note that the space complexity of the hashing solution is higher than that of the two-pointer scan, which needs only O(1) extra space when the input array is already sorted. The two_sum_sorted function above makes a sorted copy, so it also uses O(n) extra space. In terms of time, the hashing solution runs in O(n) on average, compared to O(n log n) for the two-pointer solution with sorting. Therefore, the choice of the solution depends on the requirements of the specific problem.

### Key Idea - Hashing - Trading Space for Time

The primary idea behind the hashing solution for the Two Sum problem is trading space complexity for time complexity. The hashing solution uses a hash table (dictionary in Python) to store the values in the input array as keys and their indices as values. This allows the solution to find the complement of each value in constant time on average, resulting in a time complexity of O(n).

However, this comes at the cost of a higher space complexity, which is O(n) in the worst case, where n is the length of the input array. The size of the hash table is proportional to the number of values in the input array, which is n. Therefore, the hashing solution trades space complexity for time complexity.

In contrast, the two-pointer scan itself does not require any extra data structures, resulting in a space complexity of O(1) when the input array is already sorted. If the array has to be sorted first, the sort dominates the running time: with an efficient sorting algorithm the overall time complexity is O(n log n), and sorting a copy, as two_sum_sorted does, takes O(n) extra space.

Therefore, the choice between the two solutions depends on the requirements of the specific problem. If space is not a concern and a fast solution is required, the hashing solution may be a better choice. If space is limited or the input array is already sorted, the two-pointer solution may be a better choice.
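
The trade-off can be observed with a rough timing sketch using the standard timeit module. The function names, the seed, the list size, and the unreachable target of -1 are all choices made for this illustration; absolute timings depend on the machine, so no numbers are quoted here.

```python
import random
import timeit


def two_sum_quadratic(nums, target):
    """Brute force: check every pair, O(1) extra space."""
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return [i, j]
    return []


def two_sum_linear(nums, target):
    """Hashing: one pass with a dictionary of the values seen so far, O(n) extra space."""
    seen = {}
    for i, num in enumerate(nums):
        if target - num in seen:
            return [seen[target - num], i]
        seen[num] = i
    return []


random.seed(0)  # arbitrary seed, only to make runs repeatable
nums = [random.randint(1, 10_000) for _ in range(500)]
target = -1  # unreachable with positive numbers, so both functions do their full worst-case work

for func in (two_sum_quadratic, two_sum_linear):
    seconds = timeit.timeit(lambda: func(nums, target), number=5)
    print(f"{func.__name__}: {seconds / 5 * 1000:.3f} ms per call")
```

Doubling the list size should roughly quadruple the time of the quadratic version and roughly double the time of the hashing version.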

### Two Sum - LeetCode

https://leetcode.com/problems/two-sum/

## Three Sum - Description

Given an array of integers nums and an integer target, find all unique triplets of numbers that add up to target.

The classic Three Sum problem asks for triplets that sum to zero; here it is generalized to an arbitrary target. Unlike the Two Sum functions above, the solutions below return the values of each triplet rather than their indices.

Formally, given an array nums of n integers and a target integer target, the generalized Three Sum problem requires finding all unique triplets (nums[i], nums[j], nums[k]) such that i < j < k and nums[i] + nums[j] + nums[k] = target. The solution to the problem involves returning a list of all unique triplets that satisfy this condition.

For example, consider the following array of integers and target sum: nums = [-1, 0, 1, 2, -1, -4], target = 0. The solution to the generalized Three Sum problem would be the list of unique triplets that add up to the target sum, which is [-1, 0, 1] and [-1, -1, 2].

The generalized Three Sum problem can be solved using a variety of techniques, including brute force algorithms, hashing, and two-pointer techniques. The choice of the algorithm may depend on the specific requirements of the problem, such as the size of the input array and the range of the input values. The same pattern appears in applications such as finding three prices that add up to a given total or finding combinations of three items that exactly meet a cost or weight target.

### Three Sum Brute Force - Implementation

In [None]:

def three_sum_brute(nums, target):
    """
    Given an array of integers nums and a target integer target, return a list of all unique triplets (nums[i], nums[j], nums[k])
    such that i < j < k and nums[i] + nums[j] + nums[k] = target. Brute force solution.
    """
    result = []
    for i in range(len(nums)):
        for j in range(i+1, len(nums)):
            for k in range(j+1, len(nums)):
                if nums[i] + nums[j] + nums[k] == target:
                    triplet = sorted([nums[i], nums[j], nums[k]])
                    if triplet not in result:
                        result.append(triplet)
    return result

### Three Sum Solution - Time Complexity

The three_sum_brute function takes an array of integers nums and a target integer target as input and returns a list of all unique triplets that add up to the target. The function uses a brute force algorithm that iterates through all possible combinations of three elements in the input array and checks if their sum is equal to the target.

For each triplet that satisfies the condition, the function sorts the triplet in ascending order and adds it to the result list only if it is not already in the list. This ensures that the function returns a list of unique triplets.

The three nested loops give a time complexity of O(n^3), where n is the length of the input array. The membership test against the result list adds a linear scan of the results for every matching triplet; storing the triplets as tuples in a set would make that check O(1) on average. The extra space is the result list itself, which holds at most O(n^2) unique triplets, because two values of a triplet determine the third.
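
One way to avoid scanning the result list for duplicates is to collect the triplets in a set. The sketch below, with the hypothetical name three_sum_brute_set, uses itertools.combinations for the three nested loops and sorts the final output only to make it deterministic.

```python
from itertools import combinations


def three_sum_brute_set(nums, target):
    """Brute force over all combinations of three elements, de-duplicated with a set of sorted tuples."""
    found = set()
    for triple in combinations(nums, 3):
        if sum(triple) == target:
            found.add(tuple(sorted(triple)))
    # sort so that the output order does not depend on set iteration order
    return [list(t) for t in sorted(found)]


print(three_sum_brute_set([-1, 0, 1, 2, -1, -4], 0))  # [[-1, -1, 2], [-1, 0, 1]]
```

The result contains the same two triplets as the example above, only listed in sorted order.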

## Three Sum Pointer Solution

The Three Sum problem can also be solved using the two-pointer technique, which reduces the time complexity to O(n^2) in the worst case. The idea is to sort the input array and then scan it: the first pointer, i, iterates through the array, and for each i two additional pointers, j and k, scan the remaining subarray from both ends for pairs that add up to target - nums[i].



### Three Sum Pointer Solution - Implementation



In [None]:
def three_sum_pointers(nums, target):
    """
    Given an array of integers nums and a target integer target, return a list of all unique triplets (nums[i], nums[j], nums[k])
    such that i < j < k and nums[i] + nums[j] + nums[k] = target. Two-pointer solution with sorting.
    """
    nums = sorted(nums)  # sort a copy so the caller's list is left unchanged
    result = []
    for i in range(len(nums)):
        # Skip duplicates
        if i > 0 and nums[i] == nums[i-1]:
            continue
        j, k = i+1, len(nums)-1
        while j < k:
            curr_sum = nums[i] + nums[j] + nums[k]
            if curr_sum == target:
                result.append([nums[i], nums[j], nums[k]])
                # Skip duplicates
                while j < k and nums[j] == nums[j+1]:
                    j += 1
                while j < k and nums[k] == nums[k-1]:
                    k -= 1
                j += 1
                k -= 1
            elif curr_sum < target:
                j += 1
            else:
                k -= 1
    return result

The three_sum_pointers function takes an array of integers nums and a target integer target as input and returns a list of all unique triplets that add up to the target. The function first sorts the input array. For each value of i it initializes two pointers, j and k, at the beginning and end of the remaining subarray and moves them towards each other, comparing the sum of the three values with the target.

When the sum matches, the function adds the triplet to the result list and advances both pointers past any repeated values, so the same triplet is not reported twice. Repeated values of nums[i] are skipped for the same reason.

The time complexity of the two-pointer solution is O(n^2), where n is the length of the input array: sorting costs O(n log n), and the two-pointer scan is O(n) for each of the n values of i. Apart from the result list, the extra space is at most O(n) for sorting; the result list itself can hold up to O(n^2) triplets.


See also: https://www.geeksforgeeks.org/find-a-triplet-that-sum-to-a-given-value/

## Three Sum - Hashing Solution

The Three Sum problem can also be solved using hashing, similar to the Two Sum problem. The idea is to fix the first value nums[i] and solve a Two Sum problem on the rest of the array: while scanning the later values nums[j], a hash table (dictionary in Python) records the values seen so far, and for each nums[j] we check whether the complement target - nums[i] - nums[j] is already in the table. If it is, the three values form a solution and are added to the result.

### Three Sum Hashing Solution - Implementation



In [None]:
def three_sum_hashing(nums, target):
    """
    Given an array of integers nums and a target integer target, return a list of all unique triplets (nums[i], nums[j], nums[k])
    such that i < j < k and nums[i] + nums[j] + nums[k] = target. Hashing solution.
    """
    result = []
    for i in range(len(nums)):
        hash_table = {}
        for j in range(i+1, len(nums)):
            complement = target - nums[i] - nums[j]
            if complement in hash_table:
                triplet = sorted([nums[i], nums[j], complement])
                if triplet not in result:
                    result.append(triplet)
            hash_table[nums[j]] = j
    return result

### Three Sum Hashing Solution - Time Complexity

The three_sum_hashing function takes an array of integers nums and a target integer target as input and returns a list of all unique triplets that add up to the target. For each index i, the function creates a fresh hash table and then scans the indices j after i. For each j it computes the complement target - nums[i] - nums[j] and checks whether that value was already seen between i and j. If it was, the function adds the sorted triplet to the result list, but only if it is not already in the list. Finally, nums[j] is stored in the hash table.

The time complexity of the hashing solution is O(n^2) on average because the function iterates through all pairs (i, j) and does an O(1) expected lookup for each; the membership test against the result list adds a scan of the results for every match. The hash table takes O(n) space, and the result list can hold up to O(n^2) triplets.
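
Since the stored indices are never read, a set of seen values is enough, and a set of tuples can replace the scan of the result list. The following sketch, with the hypothetical name three_sum_hash_set, shows that variant.

```python
def three_sum_hash_set(nums, target):
    """For each fixed first element, solve a Two Sum over the rest with a set of seen values."""
    found = set()
    for i in range(len(nums)):
        seen = set()  # values at positions strictly between i and j
        for j in range(i + 1, len(nums)):
            complement = target - nums[i] - nums[j]
            if complement in seen:
                found.add(tuple(sorted((nums[i], nums[j], complement))))
            seen.add(nums[j])
    # sort so that the output order does not depend on set iteration order
    return [list(t) for t in sorted(found)]


print(three_sum_hash_set([-1, 0, 1, 2, -1, -4], 0))  # [[-1, -1, 2], [-1, 0, 1]]
```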





## Three Sum Approaches Comparison

The two-pointer approach has a time complexity of O(n^2) and needs little extra space beyond the sort and the result list, which makes it more space-efficient than the hashing approach. However, it requires the input array to be sorted, which may be inconvenient when the original order must be kept or the input is constantly changing. Duplicates are handled by skipping repeated values, which works only because sorting places equal values next to each other.

On the other hand, the hashing approach has a time complexity of O(n^2) on average and uses an additional O(n) hash table, which makes it less efficient in terms of space than the two-pointer approach. It does not require the input array to be sorted, but it still needs an explicit check to avoid reporting the same triplet twice; in the implementation above that is the test of whether the triplet is already in the result list.

Therefore, if the input array is already sorted, or if sorting the input array is acceptable and the space available is limited, the two-pointer approach may be preferable. If the input array is not sorted and should stay that way, and space is not a concern, the hashing approach may be preferable. Ultimately, the choice of the approach depends on the requirements and constraints of the specific problem.
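
When comparing approaches it helps to confirm that they agree. The sketch below checks a compact two-pointer version against a slow reference on small random inputs. Both function names, the seed, and the value ranges are assumptions made for this illustration, and duplicates are removed with a set rather than by skipping repeated values.

```python
import random
from itertools import combinations


def three_sum_reference(nums, target):
    """Slow but straightforward reference: try every combination of three elements."""
    return sorted({tuple(sorted(c)) for c in combinations(nums, 3) if sum(c) == target})


def three_sum_two_pointer(nums, target):
    """Sort a copy, fix one element, and scan the rest with two pointers."""
    nums = sorted(nums)
    found = set()
    for i in range(len(nums) - 2):
        j, k = i + 1, len(nums) - 1
        while j < k:
            curr_sum = nums[i] + nums[j] + nums[k]
            if curr_sum == target:
                found.add((nums[i], nums[j], nums[k]))
                j += 1
                k -= 1
            elif curr_sum < target:
                j += 1
            else:
                k -= 1
    return sorted(found)


random.seed(1)  # arbitrary seed for repeatability
for _ in range(200):
    nums = [random.randint(-5, 5) for _ in range(random.randint(0, 12))]
    target = random.randint(-6, 6)
    assert three_sum_two_pointer(nums, target) == three_sum_reference(nums, target)
print("two-pointer results match the reference on all random cases")
```

A small value range is used on purpose so that duplicates and multiple solutions occur often.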

## Links to Three Sum Problem

* https://en.wikipedia.org/wiki/3SUM
* https://www.geeksforgeeks.org/find-a-triplet-that-sum-to-a-given-value/
* https://www.leetcode.com/problems/3sum/



## Generalized Problem

The Three Sum problem is a special case of the generalized k-Sum problem, which asks for all unique k-tuples that add up to a target value. The same families of techniques apply: brute force algorithms, hashing, and two-pointer techniques. As before, the choice of the algorithm may depend on the size of the input array and the range of the input values.

## Recursive Solution to k-sum problem

The generalized k-Sum problem extends the Three Sum problem: it requires finding all unique k-tuples of elements in an array of integers that add up to a given target integer. This problem can be solved using a recursive approach that reduces the k-Sum problem to the (k-1)-Sum problem, and so on, until we reach the Two Sum problem, which can be solved efficiently using hashing.



In [None]:
def k_sum_recursive(nums, target, k):
    """
    Given an array of integers nums, a target integer target, and a value k, return a list of all unique k-tuples of elements
    in nums that add up to target. Recursive solution.
    """
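    if k == 2:  # Base case (Two Sum)
        return two_sum(nums, target)
    # Sort a copy: the duplicate check below compares neighbours, so equal values must be adjacent.
    # Re-sorting the already sorted slices in the recursive calls is cheap.
    nums = sorted(nums)
    result = []
    for i in range(len(nums)):
        if i > 0 and nums[i] == nums[i-1]:  # Skip duplicates
            continue
        # only elements after position i are used, so each tuple is built in ascending order
        sub_results = k_sum_recursive(nums[i+1:], target-nums[i], k-1)
        for sub_result in sub_results:
            result.append([nums[i]] + sub_result)
    return result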


def two_sum(nums, target):
    """
    Given an array of integers nums and a target integer target, return a list of all unique pairs (nums[i], nums[j])
    such that nums[i] + nums[j] = target. Hashing solution.
    """
    result = []
    hash_table = {}
    for i in range(len(nums)):
        complement = target - nums[i]
        if complement in hash_table:
            pair = sorted([nums[i], complement])
            if pair not in result:
                result.append(pair)
        hash_table[nums[i]] = i
    return result

### K-sum recursive solution - complexity analysis

The k_sum_recursive function takes an array of integers nums, a target integer target, and a value k as input and returns a list of all unique k-tuples of elements in nums that add up to target. The function reduces the k-Sum problem to the (k-1)-Sum problem by iterating through the elements in the input array and recursively calling itself on the remaining subarray with a reduced target and a value of k-1. The duplicate check compares neighbouring elements, so it is only reliable when equal values are adjacent, that is, on sorted input.

If k is equal to 2, the function calls the two_sum function, which uses a hash table to efficiently find all unique pairs of elements in the input array that add up to the target. The two_sum function returns a list of all unique pairs that satisfy the condition.

The time complexity of the recursive solution is O(n^(k-1)), where n is the length of the input array: there are k-2 levels of loops over the array, and each innermost call solves a Two Sum in O(n) on average. The result list holds at most O(n^(k-1)) unique k-tuples, because the first k-1 values of a tuple determine the last one.
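
A common variant of this recursion sorts the array once and uses the two-pointer scan as the base case instead of hashing. The sketch below shows that swapped-in technique under the hypothetical name k_sum_sorted; it passes index ranges instead of slices and assumes k is at least 2.

```python
def k_sum_sorted(nums, target, k):
    """All unique k-tuples of values from nums that add up to target (assumes k >= 2)."""
    nums = sorted(nums)  # sort a copy once; the recursion works on index ranges

    def search(start, k, target):
        if k == 2:
            # Base case: Two Sum on the sorted suffix nums[start:], using two pointers
            pairs = []
            left, right = start, len(nums) - 1
            while left < right:
                curr_sum = nums[left] + nums[right]
                if curr_sum == target:
                    pairs.append([nums[left], nums[right]])
                    left += 1
                    right -= 1
                    # skip repeated values so each pair is reported once
                    while left < right and nums[left] == nums[left - 1]:
                        left += 1
                elif curr_sum < target:
                    left += 1
                else:
                    right -= 1
            return pairs
        tuples = []
        for i in range(start, len(nums) - k + 1):
            if i > start and nums[i] == nums[i - 1]:
                continue  # same leading value as the previous iteration
            for rest in search(i + 1, k - 1, target - nums[i]):
                tuples.append([nums[i]] + rest)
        return tuples

    return search(0, k, target)


print(k_sum_sorted([-1, 0, 1, 2, -1, -4], 0, 3))  # [[-1, -1, 2], [-1, 0, 1]]
print(k_sum_sorted([1, 0, -1, 0, -2, 2], 0, 4))   # [[-2, -1, 1, 2], [-2, 0, 0, 2], [-1, 0, 0, 1]]
```

Working on index ranges avoids copying the array at every level, at the price of a slightly longer base case.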

### K-sum iterative solution

The k-Sum problem can also be solved iteratively, without using recursion. The idea is to use dynamic programming: the solutions for larger tuples are built up from the solutions for smaller ones. A table stores, for every tuple size and every partial sum, the tuples found so far. Each element of the input array is then appended to the stored tuples of size i-1 to form tuples of size i.

In [None]:
def k_sum_iterative(nums, target, k):
    """
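    Given an array of integers nums, a target integer target, and a value k, return a list of all unique k-tuples of elements
    in nums that add up to target. Iterative solution.
    Assumes non-negative integers: partial sums are only tracked in the range 0..target.
    """
    # dp[(size, total)] is the set of sorted tuples with `size` elements that add up to `total`
    dp = {(0, 0): {()}}  # Base case (k=0, target=0): only the empty tuple
    for num in sorted(nums):  # sorted copy: tuples are built in ascending order
        # larger sizes first, so num is not added twice to the same tuple
        for i in reversed(range(1, k+1)):
            for j in reversed(range(num, target+1)):
                for tup in dp.get((i-1, j-num), ()):
                    # the set drops tuples that differ only in which duplicate value was used
                    dp.setdefault((i, j), set()).add(tup + (num,))
    return [list(tup) for tup in sorted(dp.get((k, target), ()))]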

### K-sum iterative solution - complexity analysis

The k_sum_iterative function takes an array of integers nums, a target integer target, and a value k as input and returns a list of k-tuples of elements in nums that add up to target. The function processes the numbers in ascending order and keeps a dictionary dp whose keys are pairs (tuple size, sum) and whose values are the partial tuples with that size and sum. The only initial entry is the key (0, 0), which stands for the empty selection.

The function then iterates through the input array and builds up the solutions for larger tuples from the solutions for smaller ones. It uses nested loops over num, i, and j, where num is the current element in the input array, i is the tuple size, and j is the sum. For each combination, the function checks if the dp dictionary has an entry for size i-1 and sum j-num. If it does, the function appends num to each of the stored tuples and saves the results under the key (i, j).

The function returns the list of solutions stored in the dp dictionary at the (k, target) key.

The nested loops visit O(n·k·t) combinations of num, i, and j, where n is the length of the input array and t is the target sum. Because dp stores the tuples themselves and not just counts, each visit also copies every partial tuple in the entry it reads, so both running time and memory are proportional to the number of stored partial tuples, which can grow combinatorially with n. The range of j also means that only non-negative numbers and targets are handled.
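
The polynomial bound is easier to see in a version of the table that stores counts instead of tuples. The sketch below, with the hypothetical name count_k_subsets, counts the combinations of k positions whose values add up to the target. It assumes non-negative integers and counts by position, so equal values at different positions are counted separately.

```python
def count_k_subsets(nums, target, k):
    """Count the index combinations of size k whose values add up to target.

    Assumes every number and the target are non-negative integers.
    """
    # ways[size][total] = number of ways to pick `size` of the elements seen so far with sum `total`
    ways = [[0] * (target + 1) for _ in range(k + 1)]
    ways[0][0] = 1  # one way to pick nothing
    for num in nums:
        # descending size so each element is used at most once per combination
        for size in range(k, 0, -1):
            for total in range(target, num - 1, -1):
                ways[size][total] += ways[size - 1][total - num]
    return ways[k][target]


print(count_k_subsets([1, 2, 3, 4, 5], 9, 3))  # 2: (1, 3, 5) and (2, 3, 4)
print(count_k_subsets([1, 1, 2], 3, 2))        # 2: each 1 pairs with the 2
```

Here every table cell is a single integer, so the running time is O(n·k·t) and the space is O(k·t).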

## Key Takeaways

* **Brute force** tries every combination of input values. It is usually the first approach to try and the easiest to get right, but rarely the most efficient: checking all k-element combinations takes O(n^k) time, where n is the length of the input array. Extra space is O(1) apart from the space needed to store the results.
* **Two-pointer technique** works on sorted arrays. One pointer starts at the beginning of the array and one at the end; they move towards each other, and at each step the sum of the values at the two pointers tells which pointer to move.
* **Hashing** works on unsorted arrays. A hash table stores the values seen so far together with their indices. When a value is encountered, its complement is calculated and looked up in the hash table; if it exists, the indices of the two values are returned. This trades O(n) extra space for O(1) average lookups.
* **Sorting** is a preprocessing step that costs O(n log n) and makes the two-pointer technique applicable. It also places equal values next to each other, which simplifies skipping duplicates.
* **Recursion** breaks a problem into smaller subproblems of the same kind, here k-Sum into (k-1)-Sum. The base case is the smallest subproblem that can be solved without further recursion, here Two Sum.
* **Dynamic programming** also builds on subproblems, but stores their solutions in a table and fills it iteratively, so that each subproblem is solved only once. The base case is the smallest table entry, here the empty selection with sum 0.


## References

* https://people.csail.mit.edu/virgi/6.s078/lecture9.pdf
* https://cs.stackexchange.com/questions/2973/generalised-3sum-k-sum-problem

## Different Problem - Subset Sum problem - for another time

* https://en.wikipedia.org/wiki/Subset_sum_problem
 