# Minimal Test

## Task 1

Write a class that has a method to resolve the MIME type of a file based on the extension of its file name. You shall not use predefined functions (such as getExtension) to process the file name or extract the MIME type.

### Remarks

Detecting a MIME type from a file name extension is generally a bad idea. A file may be renamed so that the extension is dropped altogether, or it may carry the wrong extension, e.g. a RAR archive named with .zip. A file may also have no such metadata at all and be available only as a stream of bytes.

In the real world, code should determine the MIME type by reading the actual file contents. This is easy to achieve with the Linux `file` utility:

```
file --mime-type image1.png
```
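To illustrate what content-based detection looks like in code, here is a minimal sketch that compares the leading bytes of a file against a few well-known signatures ("magic numbers"). The table `MAGIC_NUMBERS` and the function `sniff_mime_type` are hypothetical names introduced for this example, and the signature list is a tiny subset; the `file` utility relies on a far larger database.

```python
# Leading "magic" bytes of a few well-known formats. This is a small
# illustrative subset, not a complete signature database.
MAGIC_NUMBERS = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"PK\x03\x04", "application/zip"),
    (b"Rar!\x1a\x07", "application/vnd.rar"),
    (b"%PDF-", "application/pdf"),
]

DEFAULT_MIME = "application/octet-stream"


def sniff_mime_type(data):
    """Return a MIME type guessed from the first bytes of `data`."""
    for signature, mime_type in MAGIC_NUMBERS:
        if data.startswith(signature):
            return mime_type
    return DEFAULT_MIME


# A RAR archive is still recognised if its file was named "archive.zip",
# because only the content is inspected.
print(sniff_mime_type(b"Rar!\x1a\x07\x00" + b"\x00" * 16))
print(sniff_mime_type(b"plain bytes with no known signature"))
```

In practice only the first few bytes of the file need to be read, for example with `open(path, "rb").read(16)`.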

### Solution

A very basic solution using a hash map. Since there are only a handful of MIME types, I would not be surprised if storing them in a sorted array and looking them up with binary search turned out to be faster.

Time complexity:

* get_file_extension() => O(n), where n is the length of *path*
* resolve_extension()  => O(1) on average, but this depends on the cost of computing the hash.

Overall: O(n)

```python
class MimeType:
    """Pairs a file name extension with its MIME type."""

    def __init__(self, extension, mime_type):
        self.extension = str(extension).lower()
        self.mime_type = str(mime_type).lower()

    def __str__(self):
        return "['%s' : '%s']" % (self.extension, self.mime_type)


class MimeTypeResolver:
    DEFAULT_MIME = MimeType("", "application/octet-stream")

    def __init__(self, mime_types=()):
        # Index the known types by extension for hash-based lookup.
        # The default is an immutable tuple to avoid a shared mutable default.
        types = {}
        for mime in mime_types:
            types[mime.extension] = mime
        self.mime_types = types

    def resolve_extension(self, extension):
        # Extensions are stored lower-cased, so normalise the key the same way.
        e = str(extension).lower()
        if len(e) == 0 or e not in self.mime_types:
            return MimeTypeResolver.DEFAULT_MIME
        return self.mime_types[e]


# Gets the extension of a path. Assumes the path uses Unix separators.
def get_file_extension(path):
    p = str(path).lower()
    # We assume that a file name has the form <filename>.<extension>
    # and that both parts are at least one character long.
    i = p.rfind('.')
    j = p.rfind('/')  # -1 when the path has no directory part
    # Reject a missing dot, a trailing dot, a dot inside a directory name
    # (i < j) and a dot that starts the file name (i == j + 1, e.g. ".bashrc").
    if i == -1 or i == len(p) - 1 or i <= j + 1:
        return ""
    return p[i + 1:]


resolver = MimeTypeResolver([
    MimeType("png", "image/png"),
    MimeType("htm", "text/html"),
    MimeType("html", "text/html"),
])

print(resolver.resolve_extension(get_file_extension("image1.png")))
print(resolver.resolve_extension(get_file_extension("/www/index.html")))
print(resolver.resolve_extension(get_file_extension("/home/video.mp4")))
print(resolver.resolve_extension(get_file_extension("/home.mp4/extensionless")))
```

```
['png' : 'image/png']
['html' : 'text/html']
['' : 'application/octet-stream']
['' : 'application/octet-stream']
```
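The sorted-array alternative mentioned in the solution notes could look roughly like the sketch below. It uses the standard `bisect` module for the binary search; `MIME_TABLE`, `EXTENSIONS` and `resolve_sorted` are hypothetical names for this example. Whether it actually beats the hash map has not been measured here.

```python
from bisect import bisect_left

# (extension, MIME type) pairs, sorted once so that lookups can bisect.
MIME_TABLE = sorted([
    ("png", "image/png"),
    ("htm", "text/html"),
    ("html", "text/html"),
])
# Parallel list of keys only, because bisect compares whole list items.
EXTENSIONS = [extension for extension, _ in MIME_TABLE]
DEFAULT_MIME = "application/octet-stream"


def resolve_sorted(extension):
    """Binary-search the sorted extension list: O(log m) string comparisons."""
    e = str(extension).lower()
    i = bisect_left(EXTENSIONS, e)
    if i < len(EXTENSIONS) and EXTENSIONS[i] == e:
        return MIME_TABLE[i][1]
    return DEFAULT_MIME


print(resolve_sorted("html"))
print(resolve_sorted("mp4"))
```

Here m is the number of registered extensions. The lookup needs no hashing, but each comparison step still costs time proportional to the extension length.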


## Task 2

Write a method to remove comments from a text file. Comments start with /* and end with */. You must ignore any other comment types. getNextLine() is provided and allows you to obtain the next line from the file until the end is reached, in which case the method returns null. Output the result on stdout.

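### Solution

A basic state machine with four states: CODE (outside a comment), MAYBE_COMMENT (a '/' was seen and may open a comment), COMMENT (inside a comment), and MAYBE_CODE (a '*' was seen inside a comment and may close it).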

```python
import sys
from io import StringIO


class CodePurifier:
    CODE = 1           # outside a comment
    MAYBE_COMMENT = 2  # saw '/', which may open a comment
    COMMENT = 3        # inside a comment
    MAYBE_CODE = 4     # saw '*' inside a comment, which may close it

    def __init__(self, stream):
        self.stream = stream
        self.state = CodePurifier.CODE

    def parse_char(self, char):
        c = str(char)
        if len(c) != 1:
            raise ValueError("Expected character, received '%s'" % c)

        if self.state == CodePurifier.CODE:
            if c == '/':
                self.state = CodePurifier.MAYBE_COMMENT
            else:
                self.stream.write(c)

        elif self.state == CodePurifier.MAYBE_COMMENT:
            if c == '*':
                self.state = CodePurifier.COMMENT
            elif c == '/':
                # Emit the previous '/'; this one may still open a comment ("//*").
                self.stream.write('/')
            else:
                self.state = CodePurifier.CODE
                self.stream.write('/')
                self.stream.write(c)

        elif self.state == CodePurifier.COMMENT:
            if c == '*':
                self.state = CodePurifier.MAYBE_CODE

        elif self.state == CodePurifier.MAYBE_CODE:
            if c == '/':
                self.state = CodePurifier.CODE
            elif c != '*':
                # A run of '*' keeps the comment closable ("**/").
                self.state = CodePurifier.COMMENT

    def parse_line(self, line):
        for c in str(line):
            self.parse_char(c)

    def finish(self):
        # Flush a '/' that is still held back when the input ends.
        if self.state == CodePurifier.MAYBE_COMMENT:
            self.stream.write('/')
            self.state = CodePurifier.CODE


code = \
"""
This is /*not */so cool.

Blah /* blah
blah blah
blah */blah.

Other comment // types

Some / trickery * */.

In-comment/* /* trickery / * */.
"""
buf = StringIO(code)

purifier = CodePurifier(sys.stdout)

for line in buf:
    purifier.parse_line(line)
purifier.finish()
```


```
This is so cool.

Blah blah.

Other comment // types

Some / trickery * */.

In-comment.
```
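The task statement specifies a getNextLine() that returns null at the end of the file, whereas the cell above iterates over a StringIO. The sketch below shows one way the same filtering could be driven through that interface. It is a self-contained, line-based variant that searches for the delimiters with `str.find` instead of stepping through states character by character. `remove_comments` and `make_get_next_line` are hypothetical names; the latter only stands in for the provided getNextLine(), with `None` playing the role of null.

```python
import sys
from io import StringIO


def make_get_next_line(text):
    """Hypothetical stand-in for the provided getNextLine()."""
    lines = iter(StringIO(text))
    return lambda: next(lines, None)  # None once the input is exhausted


def remove_comments(get_next_line, out):
    """Copy lines to `out`, dropping everything between /* and */."""
    in_comment = False
    line = get_next_line()
    while line is not None:
        i = 0
        while i < len(line):
            if in_comment:
                end = line.find("*/", i)
                if end == -1:
                    break  # the comment continues on the next line
                in_comment = False
                i = end + 2
            else:
                start = line.find("/*", i)
                if start == -1:
                    out.write(line[i:])  # rest of the line is plain text
                    break
                out.write(line[i:start])
                in_comment = True
                i = start + 2
        line = get_next_line()


sample = "Blah /* blah\nblah blah\nblah */blah.\nOther comment // types\n"
remove_comments(make_get_next_line(sample), sys.stdout)
```

Because the search restarts after each delimiter, an opening `/*` inside a comment is ignored and comments may span any number of lines.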
