## Problem: Alien Dictionary

https://leetcode.com/problems/alien-dictionary/

There is a new alien language that uses the Latin alphabet. However, the order among the letters is unknown to you. You are given a list of words from the alien language's dictionary, where the strings in words are sorted lexicographically by the rules of this new language. Return a string of the unique letters in the new alien language, sorted in lexicographically increasing order by the rules of this language. If there is no solution, return "". If there are multiple solutions, return any one of them.

A string s is lexicographically smaller than a string t if, at the first letter where they differ, the letter in s comes before the letter in t in the alien language. If the first min(s.length, t.length) letters are the same, then s is smaller if and only if s.length < t.length.
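
The comparison rule above can be sketched as a small checker. This is only an illustrative helper, not part of the solution: the name `is_sorted_under` and its `order` parameter (a candidate alphabet string) are assumptions introduced here, and the sketch assumes every letter in `words` appears in `order`.

```python
def is_sorted_under(words, order):
    """Return True if `words` is sorted under the alphabet given by `order`."""
    # Rank of each letter in the candidate alien alphabet.
    rank = {ch: i for i, ch in enumerate(order)}

    def precedes_or_equal(s, t):
        # The first differing letter decides the comparison.
        for a, b in zip(s, t):
            if a != b:
                return rank[a] < rank[b]
        # No difference within the shared prefix: the shorter word comes first.
        return len(s) <= len(t)

    return all(precedes_or_equal(s, t) for s, t in zip(words, words[1:]))
```

Such a checker can be handy for confirming that a returned ordering is consistent with the input, since several orderings may be valid.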

Example 1:

    words = ["wrt", "wrf", "er", "ett", "rftt"]
    output = wertf

Example 2:

    words = ["ywx", "wz", "xww", "xz", "zyy", "zwz"]
    output = ywxz

Example 3:

    words = ["ba", "bc", "ac", "cab"]
    output = bac


### Approach:

Since the given words are sorted lexicographically by the rules of the alien language, we can always compare two adjacent words to determine the ordering of the characters. Take Example 3 above: ["ba", "bc", "ac", "cab"]

Take the first two words “ba” and “bc”. Starting from the beginning of the words, find the first position at which the two words differ: it would be a from “ba” and c from “bc”. Because of the sorted order of words (i.e. the dictionary!), we can conclude that a comes before c in the alien language. Similarly, from “bc” and “ac”, we can conclude that b comes before a. These two points tell us that we are actually asked to find the topological ordering of the characters, and that the ordering rules should be inferred from adjacent words in the alien dictionary.
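
The pairwise comparison just described can be sketched as follows. The helper name `ordering_constraints` is hypothetical and exists only for illustration; it lists the constraints and deliberately ignores the invalid-prefix case.

```python
def ordering_constraints(words):
    """Yield (before, after) letter pairs implied by each pair of adjacent words."""
    for w1, w2 in zip(words, words[1:]):
        for a, b in zip(w1, w2):
            if a != b:
                # Only the first mismatch carries ordering information.
                yield (a, b)
                break

print(list(ordering_constraints(["ba", "bc", "ac", "cab"])))
# [('a', 'c'), ('b', 'a'), ('a', 'c')]
```

The same constraint can appear more than once, which is one reason to store the successors of each letter in a set.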

This makes the current problem similar to Tasks Scheduling Order; the only difference is that we first need to build the graph of the characters by comparing adjacent words, and then perform a topological sort on that graph to determine the order of the characters. Two situations leave no valid order, and in both we return "": a word that is followed by its own proper prefix, and a cycle in the graph.
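
For comparison, the topological sort step can also be done iteratively with Kahn's algorithm (BFS over in-degrees). This is an alternative sketch, not the DFS-based method used in the solution; the name `kahn_order` is hypothetical, and the sketch assumes `graph` maps every letter to a set of successors, with every successor also present as a key.

```python
from collections import deque

def kahn_order(graph):
    """Return a topological order of `graph`, or [] if it contains a cycle."""
    # Count incoming edges for every node.
    indegree = {node: 0 for node in graph}
    for successors in graph.values():
        for nxt in successors:
            indegree[nxt] += 1

    # Start from the nodes that nothing has to precede.
    queue = deque(node for node in graph if indegree[node] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in graph[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)

    # Any node left unprocessed sits on a cycle.
    return order if len(order) == len(graph) else []
```

With this variant, cycle detection falls out of the final length check rather than a separate visited-state flag.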

In [24]:
def alienDict(words):
    # One node per distinct letter; an edge u -> v means u comes before v.
    graph = {l: set() for word in words for l in word}
    for i in range(1, len(words)):
        w1 = words[i-1]
        w2 = words[i]
        minLen = min(len(w1), len(w2))
        # A longer word cannot precede its own prefix, e.g. "abc" before "ab".
        if w1[:minLen] == w2[:minLen] and len(w1) > len(w2):
            return ""
        for j in range(minLen):
            if w1[j] != w2[j]:
                graph[w1[j]].add(w2[j])
                break  # Only the first mismatched letters determine the order
    res = topologicalSort(graph)
    return "".join(res)

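def topologicalSort(graph):
    visited = {}  # node -> True while on the current DFS path, False once finished
    stack = []    # nodes in post-order (finishing order)
    for l in graph:
        if dfs(l, graph, visited, stack):
            return ""  # cycle found: no valid ordering exists
    return list(reversed(stack))

def dfs(node, graph, visited, stack):
    # Returns True if a cycle is reachable from node.
    if node in visited:
        return visited[node]  # True here means a back edge, i.e. a cycle
    visited[node] = True
    for c in graph[node]:
        if dfs(c, graph, visited, stack):
            return True
    stack.append(node)
    visited[node] = False
    return False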
        

In [25]:
words = ["ba", "bc", "ac", "cab"]
alienDict(words)

'bac'

In [26]:
words = ["wrt", "wrf", "er", "ett", "rftt"] 
alienDict(words)

'wertf'

In [27]:
words = ["ywx", "wz", "xww", "xz", "zyy", "zwz"]
alienDict(words)

'ywxz'