<h3>Large Digit Fun!</h3>
<p>The following is part of a "make work" project I created for myself after completing <a href="https://www.udemy.com/course/complete-python-bootcamp/">The Complete Python Bootcamp From Zero to Hero in Python</a> and half of <a href="https://www.udemy.com/course/python-for-machine-learning-data-science-masterclass/">Python for Machine Learning & Data Science Masterclass</a>. As a break from lectures/tutorials, and something fun to sink my teeth into, I attempted to see how difficult it would be to describe very large numbers using formulas with small inputs. Turns out to be quite difficult! Argubly impossible due to <a href="https://en.wikipedia.org/wiki/Kolmogorov_complexity#Compression">Kolmogorov complexity</a>. The failed (and now embarassing) task, however, taught me a lot about many computer science, Pandas Series/Dataframes, and number theory concepts which I knew it would going in so it did not matter if I later found out it was considered impossible. I hoped to create a "lucky compression" technique because in theory if you managed to find even just one very long-digit integer (which a string of bits can create) capable of being described by a formula with small inputs (which a string of hopefully much less bits can create) then you could compress any file at least slightly. Then, since this technique would be lossless, perhaps you could find another within that compressed file, and so on. Perhaps a random-seed to jumble bits/bytes order could also be utilized to apply some additional entropy from file to file. Regardless, I do not have the computing power to really test if the concept would ever work and theory essentially says it shouldn't.</p>
<p>To entertain myself, I gamified it a bit to make the actual search more interesting. Ultimately this search was a failure and created a lot of messy output. Therefore, for this Notebook, I removed all the output and did not include the hundreds of .parquet files I generated. The output below is just a small example that generated 80 parquet files filled with function data to iteratively search through. When I had a lot more data and played around with a random .zip of an image (hoid.zip) I rarely ever found strings of numbers that could be "compressed" in this way - I couldn't reduce the number of digits enough to make up for the data that was required to describe the reduction.</p>

In [1]:
import math
import random
import numpy as np
import pandas as pd
import struct
from mpmath import mp
import re
import os.path

In [2]:
#Core functions
def get_digits(num: int) -> int:
    '''Returns the number of digits of a given integer using math.'''
    num = abs(num) #Force positive
    if num == 0: #catch the log10(zero) error
        return 0
    return 1 + math.floor(math.log10(num))
def random_alter(seed_value: int, digit_count: int):
    '''Returns a random number between which has 1 less digits than what is provided.
    Intended for altering a function result with a random seed.'''
    random.seed(seed_value, version = 2)
    return random.randint(10**(digit_count-2),10**(digit_count-1))
def max_digit_match(number_str: str, min_count: int):
    #Returns (char, start_index, end_index) for the largest digit-in-a-row which appears more than min_count
    search_digits = '0123456789'
    find = False
    found = []
    for char in search_digits:
        if re.search(char*min_count, number_str) != None:
            find = True
            found.append(char)
    if find:
        results = []
        for char in found:
            most_in_row = max(len(x) for x in re.findall(r'[%s]+' % char, number_str))
            for match in re.finditer(char*most_in_row, number_str):
                results.append((char, match.span()[1] - match.span()[0], match.span()))
        return results
    else: return None
def lin_gen_match(num: int, coef_limit:int=256, digit_reduction_min:int = 16):
    digit_count = get_digits(num)
    digit_limit = digit_count//2 - digit_reduction_min//2 #15 digit = 6 bytes.
    results = []
    mp.dps = digit_count + 10
    no_find = True
    for c1 in range(1,coef_limit+1):
        for c2 in range(1,c1+1):
            #work a basic linear generator backwords
            under_root = c1*c1 + 4*c2
            limit = mp.fdiv((c1 + mp.sqrt(under_root)),(2)) #Last divided by previous term approaches limit

            #Number only has a smallish solution if when it is divided by limit
            #   with a precision equal to its digit-count + 10, that result is an integer
            #   i.e. if this isn't an integer, don't bother trying to work it backwards.
            mpf_num = mp.fdiv(num,limit)
            if str(mpf_num)[-1] != '0': #expensive to convert to string. don't know better way
                continue
            a = int(mp.nint(mpf_num))
            b = num
            #Work the sequence backwards and stop when it goes negative or...
            #   the next number found + previous is larger than the previous one (meaning we crossed the 0-line)
            ctr = 0
            while True:
                prev_num, remainder = divmod(b - c1*a, c2)
                if prev_num + b >= a + b or prev_num <= 0 or remainder != 0:
                    ctr += 1
                    break
                a, b = prev_num, a
                ctr += 1
            if get_digits(a) < digit_limit or get_digits(b) < digit_limit:
                results.append((a, b, ctr, c1, c2))
                no_find = False
    if no_find:
        return None
    return results
def power_check(num: int, min_d: int, a_range = (1, 1024), b_range = (2, 1025), n_min = 1):
    '''Finds the largest AB^N which provided num evenly divides by.
    Retruns a list of all solutions which meet the provided options.
    
    The options are:
        min_d: Minimum number of digits the division of ab^n must remove.
        a_range: Total range of possible a-values.
        b_range: Total range of possible b-values.
        n_min: Minimum n-value to return
    
    Details:
    Function works by essentially scanning the modulus of A within provided range,
    and then if a result is zero it takes the division result and moves on to scanning
    the result with modulus B within provided range. Finally, if mod B is zero it sees
    how many increasing multiples of B it can evenly divide by which ends up as N.'''
    result = []
    no_find = True
    num_digits = get_digits(num)
    for a in range(a_range[0], a_range[1] + 1):
        ans, rmd = divmod(num, a)
        if rmd == 0:
            for b in range(b_range[0], b_range[1] + 1):
                n_count = 0
                t = b
                while ans % t == 0:
                    n_count += 1
                    t = t*b
                if n_count:
                    calc = a*b**n_count
                    digit_reduction = num_digits - get_digits(num // calc)
                    if num - calc == 0:
                        return [(a, b, n_count, get_digits(num))] #if an exact match is found, return it
                    elif digit_reduction >= min_d and n_count >= n_min:
                        no_find = False
                        result.append((a, b, n_count, digit_reduction))
                        
    if no_find:
        return None
    return result

In [26]:
class GiantOak:
    '''Big tree; many squirrel nests and nuts.
    Note: nest size of (FirstDigits = 5, Last_digits = 5) is 9,999,999,999, or 10billion max rows'''
    def __init__(self, data_directory: str, age: int, nest_size: tuple = (5,5)):
        #age = random seed value for library
        self.age = age
        # (term storage datatype, lower digits, upper digits)
        self._storage = ('uint16', 21, 80)
        self.append_parquets = False
        if os.path.isdir(data_directory) == False:
            print(f'Did not find directory: {data_directory}. Dead Oak, try again.')
        else:
            for file in os.listdir(data_directory):
                if re.match(r'\d{,}_'+str(self.age)+'.parquet', file):
                    print(f'Found {file} in {data_directory}; assume existing.\nOak is now linked to existing nests with nest size {nest_size}')
                    self.append_parquets = True
                    break
            if self.append_parquets == False:
                print(f'Found {data_directory} to be void of nests.\nThis is an empty Oak ready for nesting with nest size {nest_size}.')
                
        self.nests = {}
        self.nests_open = set()
        self.dir = data_directory
        self.nest_size = nest_size #(FirstDigits, LastDigits)
    def get_storage(self):
        return self._storage
    def open_nests(self, data_to_load: list, sparse: bool = True, columns: list = None):
        '''Load existing parquet file made from DataPool()'''
        #Add in a check of index name being same as the class object.
        #    if it is, change the Class Objects and Print a warning
        index_check = False
        for digits_num in data_to_load:
            if digits_num in self.nests_open:
                print(f'{digits_num} was found to be already open. Skipped opening it; close the open one first.')
                continue
            try:
                self.nests[digits_num] = pd.read_parquet(f'{self.dir}\\{digits_num}_{self.age}.parquet', engine = 'pyarrow')
            except FileNotFoundError:
                print(f'Nest {digits_num}_{self.age}.parquet not found. Aborting.')
                return False #use for checking if it worked.
            self.nests_open.add(digits_num)
            if index_check:
                multi_index_names = self.nests[digits_num].index.names
                if multi_index_names == [f'F{self.nest_size[0]}', f'L{self.nest_size[1]}']:
                    index_check = False
                else:
                    print(f"Found loaded index nest_size {multi_index_names} to differ from this Oak's nest_size of {nest_size}.")
                    self.nest_size = (int(temp_df.index.names[0][1:]), int(temp_df.index.names[1][1:]))
                    print(f'This Oak has now been changed to {nest_size}.')
                    index_check = False
                    
            #Convert the opened DataFrame to sparse to save memory
            if sparse:
                self.nests[digits_num] = self.nests[digits_num].astype(pd.SparseDtype("object", np.nan))
        return True #means all worked
            
    def close_nests(self, data_to_close: list):
        for digits_num in data_to_close:
            if digits_num not in self.nests_open:
                print(f'Did not find {digits_num} to be open.')
                continue
            self.nests[digits_num] = ''
            del self.nests[digits_num]
            self.nests_open.remove(digits_num)
    def close_all_nests(self):
        list_of_open_nests = list(self.nests_open)
        for digits_num in list_of_open_nests:
            self.nests[digits_num] = ''
            del self.nests[digits_num]
            self.nests_open.remove(digits_num)

In [4]:
class MasterOwl:
    def __init__(self, oak: object, name: str):
        #Variable n is always the largest range variable and is to be the last in the list.
        self._tree = oak
        self._name = name
        
    def F_L_index(self, num: int) -> (str, int):
        '''Returns a tuple of the first and last digits for multi-indexing.
        The amount of digits depends on the Oak's nest size.'''
        original_digits = get_digits(num)
        num_start_digits = self._tree.nest_size[0]
        num_end_digits = self._tree.nest_size[1]
        if original_digits <= num_start_digits:
            return ((num, 0), original_digits)
        first_digits = num // 10**(original_digits-num_start_digits)
        last_digits = num % 10**num_end_digits #Won't include 0's before

        return ((first_digits, last_digits), original_digits)
        
    def command_squirrels(self, squirrel: object) -> dict:
        '''
        '''
        if self._tree.nest_size[0] < 1:
            print('Number of digits for Categorize must be greater than 0 to prevent issues with the 0-index.\nAborting function.')
            return None
        #Grab current level data
        name, stats = squirrel.deploy()
        func_type, inputs, bits = stats
        inputs_dtype, lower_digits, upper_digits = self._tree.get_storage()

        #Setup/Gather initial info for iterations.
        num_inputs = len(inputs)
        total_iterations = 1
        acorns_added = 0 #track number of inputs within digit range that get added for feeback
        iterations = 0 #iteration ctr -> was easier to work with for updates
        function_inputs = []
        zero_index_arr = []
        for ctr, input_set in enumerate(inputs):
            min_i, max_i = input_set
            value_range = max_i + 1 - min_i #get what would be range(a_min,a_max + 1)
            total_iterations = total_iterations*value_range #get what would be total nested loops
            function_inputs.append(min_i) #Save initial values for iterating
            zero_index_arr.append(min_i) #add min to zero-index info
            zero_index_arr.append(max_i) #add max to zero-index info
            zero_index_arr.append(bits[ctr]) #add bit for above min/max
            
        #Setup 0-index location to be min/max & bit info which will be first category item for each Digit-key
        #    Min/Max/Bit goes: [min, max, bit, min, max, bit, min, max, bit] etc, per # of inputs
        zero_index = np.array(zero_index_arr, dtype=inputs_dtype)
        dict_output = {}
        
        #Set last range as the main looper which will always be largest
        loop_range = inputs[num_inputs-1]
        loop_iterations = loop_range[1] + 1 - loop_range[0] #inclusive total
        update_freq = 10 # % step to get notified
        update_notify = total_iterations // int(100/update_freq) #setup to notify user at update_freq%
        notify = 1
        print(f'{self._name} has directed {name.capitalize()}\'s troops to search and find acorns of rank {lower_digits} to {upper_digits}.',
              f'\n{name.capitalize()} sent {total_iterations} squirrels out on the mission.\n')
        while total_iterations > iterations: #stop when outta iterations
            if iterations >= update_notify*notify:
                print(f'{notify*update_freq}% squirrels have returned; {iterations-acorns_added} total have failed.\nThere are {total_iterations-iterations} squirrels left searching.')
                notify+=1
            if func_type == 'Simple':
                for n in range(loop_iterations): #main looper
                    function_call = getattr(squirrel, 'func_' + name) #getfunction by name (string) Save as object to call
                    result = function_call(function_inputs)
                    index_category, num_digits  = self.F_L_index(result)
                    
                    #Only record values between set digit amount
                    if num_digits < lower_digits or num_digits > upper_digits:
                        function_inputs[num_inputs-1] += 1
                        iterations += 1
                        continue

                    #Add results to Dictionary of Digits which are each a Dictionary of Category/Inputs
                    add_inputs = np.array(function_inputs, dtype=inputs_dtype)
                    if num_digits in dict_output:
                        if index_category in dict_output[num_digits]:
                            dict_output[num_digits][index_category] = np.append(dict_output[num_digits][index_category], add_inputs)
                        else:
                            dict_output[num_digits][index_category] = add_inputs
                    else:
                        dict_output[num_digits] = {(0, 0) : zero_index, index_category : add_inputs}
                    acorns_added +=1
                    #increment the last term which is our main looper & increase # of iterations
                    function_inputs[num_inputs-1] += 1
                    iterations += 1
            elif func_type == 'Generator':
                #since generator functions work by creating every number on the way up to result n,
                #   we can save a lot of computer by switching the way we iterate/save for them
                function_call = getattr(squirrel, 'func_' + name) #getfunction by name (string) Save as object to call
                n_results_list = function_call(function_inputs, False, loop_range) #returns a list of loop_range
                for ctr, result in enumerate(n_results_list):
                    index_category, num_digits  = self.F_L_index(result)
                    #Only record values between set digit amount
                    if num_digits < lower_digits or num_digits > upper_digits:
                        continue
                    
                    #Add results to Dictionary of Digits which are each a Dictionary of Category/Inputs
                    add_inputs = np.array(function_inputs[:-1]+[(loop_range[0]+ctr)], dtype=inputs_dtype)
                    if num_digits in dict_output:
                        if index_category in dict_output[num_digits]:
                            dict_output[num_digits][index_category] = np.append(dict_output[num_digits][index_category], add_inputs)
                        else:
                            dict_output[num_digits][index_category] = add_inputs
                    else:
                        dict_output[num_digits] = {(0, 0) : zero_index, index_category : add_inputs}
                    acorns_added +=1
                iterations += loop_iterations
                    
            else:
                print('Missing identifiable function type name (i.e. Simple, Generator, etc).\nAborting.')
                return None
            #Reset looping term after full loop completed
            function_inputs[num_inputs-1] = inputs[num_inputs-1][0] 
            #Replicate how terms increase in nested loops, but we'll go left-to-right minus the looper
            #   rather than how nested intuatively feels right-to-left.
            if num_digits > 2:
                for ctr, i in enumerate(function_inputs[0:-1]):
                    if i != inputs[ctr][1]:
                        function_inputs[ctr] +=1
                        break
                    else:
                        function_inputs[ctr] = inputs[ctr][0]
            else:
                function_inputs[0] += 1
        print(f'\nMissions complete!\nOut of {name.capitalize()}\'s {total_iterations} squirrels sent out',
              f'only {acorns_added} squirrels returned with acorns worthy of ranks {lower_digits} to',
              f'{upper_digits} ({round((acorns_added/total_iterations)*100,2)}%).\n')
        return dict_output
    def add_acorns(self, squirrel: object) -> pd.core.frame.DataFrame:
        '''
        Takes the dict_output from self.gather_acorns and adds it to the Oak Tree object.
        Adds to existing partquet files in the Oak's directory or creates new
        if the first of the set isn't already in the directory.
        '''
        self._tree.close_all_nests() #close any open nests just in case
        #Call gather_acorns() to get raw dictionary data
        generated_data = self.command_squirrels(squirrel)
        column_name = squirrel.deploy()[0]
        print('Storing acorns...')
        #Create the dataframes by digit-key and save as parquet 0_1.
        #    _1 is to maybe increment later.
        for digit_key in generated_data:
            data_reshape = np.array(list(generated_data[digit_key].values()), dtype = 'object').reshape(-1,1) #reshape [1,x] to [x,1]
            multi_index_names = [f'F{self._tree.nest_size[0]}', f'L{self._tree.nest_size[1]}'] #set index names to have nest_size
            multi_index = pd.MultiIndex.from_tuples(generated_data[digit_key].keys(), names=multi_index_names)
            temp_df = pd.DataFrame(data_reshape, columns=[column_name], index = multi_index)
            if self._tree.append_parquets:
                #try statement catches if the file didn't exist for some reason
                try:
                    temp_df = pd.concat([pd.read_parquet(f'{self._tree.dir}\\{digit_key}_{self._tree.age}.parquet', engine = 'pyarrow'), temp_df], axis = 1)
                except:
                    pass
            temp_df.sort_index(axis = 0, ascending = True, inplace = True)
            temp_df.to_parquet(f'{self._tree.dir}\\{digit_key}_{self._tree.age}.parquet', engine = 'pyarrow')
        
        #Now we know data is there; set append to true
        self._tree.append_parquets = True
        print(f'{column_name}\'s acorns have been successfully stored in the Oak\'s many nests.')
    def remove_acorns(self, squirrel: object):
        column_name = squirrel.sq_leader
        self._tree.close_all_nests() #close any open nests
        #Remove a column from each data set.
        if len(self._tree.nests) > 0:
            print('Tree has nests currently open. Close them before proceeding.')
            return None
        input_dtype, lower_digits, upper_digits = self._tree.get_storage()
        for i in range(lower_digits, upper_digits + 1):
            self._tree.open_nests([i], False) # open df and don't convert to sparse
            try:
                self._tree.nests[i].drop([column_name], axis = 1, inplace = True)
                self._tree.nests[i].dropna(how='all', inplace = True) #Removes full NaN rows
            except: print(f'Nest {i} did not have column {column_name}')
            try: self._tree.nests[i].to_parquet(self._tree.dir + '\\' + str(i) + '_1' + '.parquet', engine = 'pyarrow')
            except: print(f'Was unable to save nest {i} after removing {column_name}.')
            self._tree.close_nests([i])
        print(f'{column_name}\'s acorns have been removed from the Oak.')
        

In [5]:
class SquirrelBrigade:
    def __init__(self, oak: object, sq_leader: str=''):
        self._tree = oak
        self.sq_leader = sq_leader
        #Library of functions, I mean squirrels, to find nuts, I mean data:
        #age: {Func_code/name : ([Tuples of min/max a, b, c,...n], #inputs, [bits per input a, b, c,...n])}
        self._squirrel_leaders = { #Max int size is 65535 for library uint16
            'chip' : ('Simple', [(44, 60), (3, 19), (1, 257)],[4,4,8]), #10/1000 --> 100%
            'dale' : ('Simple', [(2, 65), (3, 1026)], [6,10]), #10/1000 --> 69%
            'fred' : ('Simple', [(2, 258), (2, 258), (150, 406)], [8,8,8]), 
            'sam' : ('Simple', [(2, 130), (2, 130), (3, 1026)], [7,7,10]), #10/1000 --> 59%
            'matt' : ('Generator', [(1, 33), (1, 65), (1, 65), (420, 548)], [5,6,6,7]), #10/1000 --> 100%
            'steve' : (),
            'stephen' : (),
            'dave' : ()}
    def deploy(self):
        return (self.sq_leader, self._squirrel_leaders[self.sq_leader])
    def list_names(self):
        arr  = []
        for sq in self._squirrel_leaders:
            arr.append(sq)
        return arr
    def sq_info(self, name: str = '') -> str:
        if name == '' and self._squirrel != '': #Means return current
            function_call = getattr(self, 'func_' + self._squirrel)
        elif name == '':
            print('No squirrel set or given. Try providing a name.')
            return None
        else: #Return named
            function_call = getattr(self, 'func_' + name)
        print(function_call.__doc__)
    def call(self, list_of_terms: list, sq_leader: str = '')-> list:
        '''Return a list of function outputs from the provided name & list_of_terms.
        list_of_terms length must be a multiple of the named function's term-count'''
        if sq_leader == '': #if no name provided, assume current.
            sq_leader = self.sq_leader
        terms_per = len(self._squirrel_leaders[sq_leader][1]) #counts term-tuples
        total_terms = len(list_of_terms)
        function_call = getattr(self, 'func_' + sq_leader)
        outputs = []
        for i in range(total_terms//terms_per):
            function_inputs = list_of_terms[i*terms_per:i*terms_per+terms_per]
            result = function_call(function_inputs)
            outputs.append((function_inputs,result))
        return outputs
    ##Functions##
    #Every function's actual result will be altered using random seed.
    def func_chip(self, terms: [int, int, int])-> int:
        '''Simple [n, b, a]: ab^n'''
        n, b, a = terms
        initial_result = a*b**n
        random_mod = random_alter(self._tree.age,get_digits(initial_result))
        return initial_result-random_mod
    def func_dale(self, terms: [int, int])-> int:
        ''''''
    def func_fred(self, terms: [int, int, int])-> int:
        ''''''
    def func_sam(self, terms: [int, int, int])-> int:
        '''Simple [a, b, n]: ab^(n) - ab^(n-1) '''
        return terms[0]*terms[1]**terms[2] - terms[0]*terms[1]**(terms[2]-1)
    def func_matt(self, terms: [int, int, int, int], basic=True, n_return_range: tuple = None)-> int:
        '''Generator [c1,(Sn-2),(Sn-1),n]
        Where:
              c = c1
              a = Sn-2
              b = Sn-1
              n = N
        if a > b: b = 2*(b + a)   <-- skips starting term issues
        Sn = abs(c*a - 2*c*b)'''
        c, a, b, n = terms
        if a > b: b = 2*(b + a)
        if n < 1:
            return a
        if basic:
            for i in range(1,n):
                a, b = b, c*a - 2*c*b
            return abs(b)
        else:
            #return a list within a given range if it's not Basic
            output_list = []
            n_min, n_max = n_return_range
            for i in range(1,n_max):
                a, b = b, c*a - 2*c*b
                if i+1 >= n_min:
                    output_list.append(abs(b))
            return output_list

In [6]:
class EvilArborist:
    def __init__(self, directory: str = '', filename: str = '', test_num: int = 0):
        self.dir = directory
        self.filename = filename
        self.solved_bytes = []
        self.current_byte_location = 0
        self.factors = []
        self.last_factor_try = 1
        self.list_of_operations = [(None, None)]
        if test_num == 0:
            self.number = 0
            self.number_str = ''
            self.digit_count = 0
        else:
            self.number = test_num
            self.number_str = str(test_num)
            self.digit_count = len(self.number_str)
        print('Oh noes! An evil arborist is coming to cut down the Great Oak!')
    def wield_axe(self, byte_amount: int, endian: str = 'big'):
        #Load in byte_amount of bytes from file and convert to int and string of the int
        self.list_of_operations[0] = (f'{endian}_endian', byte_amount) #set first operation
        f = open(self.dir + self.filename, 'rb') #read as binary
        f.seek(self.current_byte_location, 0)    # move the file pointer forward. 0 = reference point is the start
        data = f.read(byte_amount)
        f.close()
        self.number = int.from_bytes(data, byteorder=endian, signed=False)
        self.number_str = str(self.number)
        self.digit_count = len(self.number_str) #faster than get_digits() if already a string
        if endian == 'big':
            print(f'The arborist has equipped a BIG {self.digit_count}kg double-axe.')
        else: print(f'The arborist has equipped a LITTLE double-axe capable of {self.digit_count}km/hr swings.')
        digit_match = max_digit_match(self.number_str, 8)
        if digit_match != None:
            print(f'Wow, looks like his axe has a weak spot:')
            print(digit_match)
    def sharpen_axe(self, wield: bool, sharpness: int = 25, passes: int = 100000000):
        find = True
        num_factors = 0
        factors = []
        factor = self.last_factor_try
        if factor == 1: #If it's equal to 1, skip ahead one to avoid double-upping on 1
            factor += 1
        print(f'The arborist has decided to sit and sharpen a side of his axe.')
        while find:
            find = False
            for i in range(1, passes+1):
                if self.number % (i + factor) == 0:
                    factors.append(factor)
                    num_factors += 1
                    factor = factor + i
                    while self.number % (10*factor) == 0:
                        factor = factor*10
                    while self.number % (2*factor) == 0:
                        factors.append(factor)
                        num_factors += 1
                        factor = factor*2
                    find = True
                    if num_factors > sharpness:
                        factors = factors[num_factors-sharpness:num_factors] #cut anything before sharpness
                        num_factors = sharpness
            if find: print(f'The arborist swaps to a new sharpening stone, his last one managed {factor} sharpen passes.')
            elif factors == []: print(f'The arborist has finished sharpening and his axe did not get sharper.')
            else: print(f'The arborist has finished sharpening, he looks impressed with his last stone\'s {factors[-1]} sharpen passes.')
        if wield == True:
            self.list_of_operations.append(('DivideBy', factor))
            self.number = self.number // factor
            self.number_str = str(test_num)
            self.digit_count = len(self.number_str)
            print(f'The arborist has gotten up and flipped his axe to chop with it\'s newly sharpened edge and +{self.digit_count} slash stat.')
        else:
            print(f'The arborist has gotten gotten up and decided to keep using the duller side of his axe. For now...')
            self.factors = self.factors + factors
            self.last_factor_try = factor + passes

In [38]:
class Birds:
    def __init__(self, tree: object, arborist: object, squirrel: object):
        self._tree = tree
        self._threat = arborist
        self._squirrel = squirrel
    def acorn_shot(self, num_str: str = None, digit_count: int = None):
        #Check for an exact digit-match
        if num_str == None or digit_count == None:
            num_str = self._threat.number_str
            digit_count = self._threat.digit_count
        first_digits = int(num_str[0:self._tree.nest_size[0]])
        last_digits = int(num_str[-1*self._tree.nest_size[1]:])
        if digit_count not in self._tree.nests_open:
            self._tree.open_nests([digit_count])
        try:
            found_items = self._tree.nests[digit_count].loc[first_digits].loc[last_digits].dropna()
            found_indicies = found_items.index
        except: return None
        num_int = int(num_str) #convert to int to make if-checks quicker
        for index in found_indicies:
            terms_per = len(self._squirrel._squirrel_leaders[self._tree.level][index][1]) #need a better way
            terms = found_items[index]
            function_call = getattr(self._squirrel, 'func_' + index)
            for i in range(int(len(terms)/terms_per)):
                function_inputs = terms[i*terms_per:i*terms_per+terms_per]
                result = function_call(function_inputs.tolist()) #numpy 'unit16' arrays needs to be python int
                if result == num_int:
                    print('Jackpot!')
                    return (index, function_inputs)
        return None
    def acorn_barrage(self, lin_gen: bool = False):
        #This function has so much room for efficiency improvements
        limit = 21 #digits
        enemy = self._threat.number_str
        enemy_digits = self._threat.digit_count
        results = []
        total_seeks = 0
        for i in range(1, enemy_digits+1-limit): #There is def a better way. it's sum of simple series.
            total_seeks+=i
        update_freq = 20 #% to get notified
        update_notify = total_seeks // int(100/update_freq) #setup to notify user at update_freq%
        notify = 1
        seek_count = 0
        print(f'Sending a barrage of {total_seeks} acorns')
        for digit_chunk in range(enemy_digits, limit - 1, -1):
            seeks = enemy_digits-digit_chunk
            if seek_count >= update_notify*notify:
                print(f'{notify*update_freq}% of acorn ammo used.')
                notify +=1
            while seeks >= 0:
                seek_str = enemy[seeks:seeks+digit_chunk]
                seeks -= 1
                seek_count += 1
                if lin_gen:
                    shot = lin_gen_match(int(seek_str),25)
                else:
                    shot = self.acorn_shot(seek_str, digit_chunk)
                if shot != None:
                    results.append((seek_str, shot))
                    print(f'Scored: {(seek_str, shot)}')
        return results
    def peck(self, min_digit_match_count: int, typ: str = '-', depth: int = 1):
        '''Searches the dataframe with same number of digits, and then for each function,
        it computes the result and subtracts it from the main number.
        Then it runs a match_digits check with the minimum input'''
        enemy = self._threat.number
        enemy_digits = self._threat.digit_count
        largest_digit_diff = 0
        best_digit_diff = None
        matched_digits_output = {}
        for peck in range(depth):
            print(f'Diving for peck #{peck+1}')
            enemy_digits = enemy_digits - peck #(first is zero)
            matched_digits_output[enemy_digits] = None
            if enemy_digits not in self._tree.nests_open:
                self._tree.open_nests([enemy_digits])
            for column in self._tree.nests[enemy_digits]:
                for terms in self._tree.nests[enemy_digits][column].dropna()[1:]:#skip zero-index info
                    results = self._squirrel.call(terms.tolist(), column)#have to convert nparray to python int-list
                    for result in results:
                        #Reminder: result = (inputs, calculation)
                        if typ == '+':
                            calc = enemy+result[1]
                        else:
                            calc = abs(enemy-result[1])
                        if enemy_digits == self._threat.digit_count:
                            digit_diff = (enemy_digits-get_digits(calc))
                            if digit_diff > largest_digit_diff:
                                largest_digit_diff = digit_diff
                                best_digit_diff = (column, result[0])
                        matched_digits = max_digit_match(str(calc), min_digit_match_count)
                        if matched_digits != None:
                            if matched_digits_output[enemy_digits] == None:
                                matched_digits_output[enemy_digits] = [(column, result[0], matched_digits)]
                            else:
                                matched_digits_output[enemy_digits].append((column, result[0], matched_digits))
        print(f'Best digit reduction was {largest_digit_diff} digits with: {best_digit_diff}')
        print(f'With a min match of {min_digit_match_count} and doing {typ}:')
        return matched_digits_output
    def gust(self, min_digit_match_count: int, power: int = 1):
        num_factors = len(self._threat.factors)
        matched_digits_output = {}
        if power > num_factors:
            power = num_factors
        for factor in self._threat.factors[num_factors-power:num_factors]:
            new_num_str = str(self._threat.number//factor)
            digit_match = max_digit_match(new_num_str, min_digit_match_count)
            if digit_match != 0:
                matched_digits_output[factor] = digit_match
        return matched_digits_output

In [27]:
tree = GiantOak('data\\age_0', 0)

Found data\age_0 to be void of nests.
This is an empty Oak ready for nesting with nest size (5, 5).


In [28]:
squirrel = SquirrelBrigade(tree,'chip')

In [29]:
owl = MasterOwl(tree, 'Hooty')

In [30]:
owl.add_acorns(squirrel)

Hooty has directed Chip's troops to search and find acorns of rank 21 to 80. 
Chip sent 74273 squirrels out on the mission.

10% squirrels have returned; 0 total have failed.
There are 66820 squirrels left searching.
20% squirrels have returned; 0 total have failed.
There are 59367 squirrels left searching.
30% squirrels have returned; 0 total have failed.
There are 51914 squirrels left searching.
40% squirrels have returned; 0 total have failed.
There are 44461 squirrels left searching.
50% squirrels have returned; 0 total have failed.
There are 37008 squirrels left searching.
60% squirrels have returned; 0 total have failed.
There are 29555 squirrels left searching.
70% squirrels have returned; 0 total have failed.
There are 22102 squirrels left searching.
80% squirrels have returned; 0 total have failed.
There are 14649 squirrels left searching.
90% squirrels have returned; 0 total have failed.
There are 7196 squirrels left searching.

Missions complete!
Out of Chip's 74273 squirrel

In [31]:
evil = EvilArborist('', 'hoid.zip')

Oh noes! An evil arborist is coming to cut down the Great Oak!


In [32]:
evil.wield_axe(19, 'big')

The arborist has equipped a BIG 46kg double-axe.


In [39]:
bird = Birds(tree, evil, squirrel)

In [40]:
bird.peck(8, '+', 1)

Diving for peck #1
Best digit reduction was 0 digits with: None
With a min match of 8 and doing +:


{46: None}

In [41]:
bird.peck(8)

Diving for peck #1
Best digit reduction was 3 digits with: ('chip', [56, 6, 66])
With a min match of 8 and doing -:


{46: None}

In [42]:
bird.peck(6)

Diving for peck #1
Best digit reduction was 3 digits with: ('chip', [56, 6, 66])
With a min match of 6 and doing -:


{46: [('chip', [48, 8, 128], [('2', 6, (6, 12))]),
  ('chip', [49, 8, 16], [('2', 6, (6, 12))]),
  ('chip', [50, 8, 2], [('2', 6, (6, 12))])]}

In [43]:
bird.acorn_barrage()

Sending a barrage of 325 acorns
20% of acorn ammo used.
40% of acorn ammo used.
60% of acorn ammo used.
80% of acorn ammo used.
100% of acorn ammo used.


[]

In [44]:
bird.gust(8)

{}