# Behind this code

With only a few weeks left in the competition, it’s been an amazing journey so far. I’ve already learned so much from the incredible community and the variety of creative tasks.

Special thanks to @kenkrige, @kayjoking, @jacekwl, @lukeg10081, @jazivxt and @tonylica for sharing so many brilliant ideas and practical tips on golf-style coding!
To be honest, I’ve discovered plenty of compact Python tricks I’d never seen before, many of them from this awesome community and from Code Golf Stack Exchange (https://codegolf.stackexchange.com/). These are definitely tools I’ll keep using in future challenges.

There are still plenty of hidden gems behind the top scores that I can’t wait to explore once the competition ends.
But since we still have a bit of time left, I wanted to share a few simplification ideas I’ve noted for myself along the way.

In this notebook, I’ll walk through two examples that show how tiny tweaks can make code both shorter and smarter.

In [None]:
import sys
sys.path.append("/kaggle/input/google-code-golf-2025/code_golf_utils")
from code_golf_utils import *

# My Approach

Even though LLMs aren’t yet as strong at golf coding as human experts, I’ve found that simplifying the problem and guiding the model with a clear logic description can make a big difference in code length and quality.

Moreover, trying alternative reasoning paths — instead of forcing the same logic repeatedly — often leads to much shorter and cleaner solutions.
Sometimes, just changing how the task is framed can unlock surprisingly elegant results.

## Example 1

In [None]:
task_num = 165

In [None]:
%%writefile task.py
def p(x,e=range):
 from collections import Counter as j,deque;o,h=len(x),len(x[0]);p=j(sum(x,[])).most_common()[0][0];n=[[0]*h for k in e(o)];i=[]
 for k in e(o):
  for l in e(h):
   if n[k][l]or x[k][l]==p:continue
   t=x[k][l];s=deque([(k,l)]);n[k][l]=1;r=[(k,l)]
   while s:
    y,m=s.popleft()
    for(u,b)in[(-1,-1),(-1,0),(-1,1),(0,-1),(0,1),(1,-1),(1,0),(1,1)]:
     if 0<=y+u<o and 0<=m+b<h and not n[y+u][m+b]and x[y+u][m+b]==t:n[y+u][m+b]=1;s.append((y+u,m+b));r.append((y+u,m+b))
   c,c=[k for(k,l)in r],[k for(l,k)in r];i.append((t,r,(min(c),min(c),max(c),max(c))))
 if not i:return x
 i.sort(key=lambda n:len(n[1]),reverse=1);q=i[0];l=i[1:]
 if not l:return x
 r=j(k for(k,l,q)in l).most_common()[0][0];n={};[n.setdefault(k,[]).append(l)for(k,l)in q[1]];[k.sort()for k in n.values()];n={};[n.setdefault(m,[]).append(y)for(k,l,u)in l if k==r for(y,m)in l];[k.sort()for k in n.values()];u=[k for(k,l)in{k:[l for(l,y)in q[1]if y==k]for(l,k)in q[1]}.items()if k in n and n[k][-1]>=l[-1]and any(x[y][k]==p for y in e(l[0],o)if y not in l)];u and[x[l].__setitem__(k,r)for k in sorted(u)for l in e(min(l for(l,y)in q[1]if y==k),o)if l not in[l for(l,y)in q[1]if y==k]and x[l][k]==p];return x

In [None]:
verify_program(task_num, load_examples(task_num))

In [None]:
show_examples(load_examples(task_num)['train'])
show_examples(load_examples(task_num)['arc-gen'][101:104])

### Simplification

The initial version of the code finds the main shape with a breadth-first search, picks the dominant secondary color (r in the code), and draws vertical fills in that color beneath the main figure - essentially extending or connecting it to existing regions of the same color.

However, after reviewing several examples, a few key insights helped simplify the logic dramatically:

1. Fixed Grid Size:
The grid is always 20×20. This means we don’t need to dynamically check the dimensions - we can safely assume them and skip any length-related calculations.

2. Two Colors + Background:
Each puzzle includes only two non-zero colors plus the background (0).
Because the “spaceship” always sits above the main color area, we can assume that the bottom color is the main one.
This lets us skip complex shape detection or BFS-style pattern searches - simply identify the last non-zero color as the main color and the other as the stop color.

3. Column-wise Independence:
We don’t need to understand the overall geometry of the shape.
Each column can be processed independently: if the main color appears below the lowest stop-color cell in a column, fill every background cell from that cell down to the bottom row with the main color.
This column-at-a-time mindset often leads to much shorter (and more elegant) golf solutions.
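
The three observations above can be sketched in a readable, ungolfed form. This is only an illustration under my own assumptions: the name fill_columns and the 4×3 toy grid are made up here (the real grids are 20×20), and the height is read from the grid instead of being hard-coded.

```python
def fill_columns(grid):
    """Ungolfed sketch of the column-wise fill (illustrative names only)."""
    cells = [v for row in grid for v in row]
    main = next(v for v in reversed(cells) if v)   # last non-zero color
    stop = (set(cells) - {0, main}).pop()          # the other non-zero color
    height = len(grid)
    for j, col in enumerate(zip(*grid)):           # one tuple per column
        if stop not in col:
            continue
        # Row index of the lowest stop-color cell in this column.
        lowest_stop = height - 1 - col[::-1].index(stop)
        # Only fill when the main color appears somewhere below it.
        if main in col[lowest_stop + 1:]:
            for k in range(lowest_stop, height):
                grid[k][j] = grid[k][j] or main    # touch background cells only
    return grid


# Toy grid, not taken from the task data: 2 is the stop color, 1 the main color.
toy = [
    [0, 2, 0],
    [0, 2, 2],
    [0, 0, 0],
    [1, 1, 0],
]
result = fill_columns(toy)
# Only the middle column is filled: it has the main color below its lowest 2.
print(result)  # [[0, 2, 0], [0, 2, 2], [0, 1, 0], [1, 1, 0]]
```

The right-hand column is left alone because nothing of the main color sits below its stop cell, which is the same condition the golfed version checks.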

### Golf Tricks Applied

To make the solution as compact as possible, I focused on using a few well-known golfing techniques that help reduce both code size and complexity:


* Flattening the grid with sum(g, []) instead of nested loops.
* Finding the last non-zero element with next(...) over the reversed flat list a[::-1].
* Getting the “other” color with a simple set difference and .pop() call.
* Performing column traversal neatly with zip(*g) to avoid explicit indexing.
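
Here is each of the four tricks in isolation, as a small sketch on a made-up 3×3 grid (the grid and the variable names are illustrative, not task data). The last line shows the optional default argument of next, which is only needed when the generator might be empty.

```python
# Toy grid (made up for illustration): 0 is background, 3 and 5 are colors.
g = [[0, 3, 0],
     [0, 3, 0],
     [5, 0, 0]]

# 1. Flatten: sum starts from [] and concatenates the rows.
flat = sum(g, [])                          # [0, 3, 0, 0, 3, 0, 5, 0, 0]

# 2. Last non-zero element: first truthy value of the reversed list.
last = next(v for v in flat[::-1] if v)    # 5

# 3. The "other" color: remove background and the known color, pop what is left.
other = (set(flat) - {0, last}).pop()      # 3

# 4. Columns without indexing: zip(*g) transposes the grid.
cols = list(zip(*g))                       # [(0, 0, 5), (3, 3, 0), (0, 0, 0)]

# Optional default: avoids StopIteration when nothing matches.
empty = next((v for v in [0, 0] if v), 0)  # 0
```

Note that sum(g, []) is quadratic in the number of rows, which does not matter for grids this small but would for large inputs.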

Using these tricks, I ended up with the following concise implementation.
It might not reach the very top of the leaderboard, but it significantly reduces code length while keeping the logic clear and robust.

In [None]:
%%writefile task.py
def p(g):
 a=sum(g,[]);s=next(x for x in a[::-1] if x);f=(set(a)-{0,s}).pop()
 for j,c in enumerate(zip(*g)):
  if f in c and s in c[(i:=19-c[::-1].index(f))+1:]:
   for k in range(i,20): g[k][j]=g[k][j] or s
 return g

In [None]:
verify_program(task_num, load_examples(task_num))

## Example 2

In [None]:
task_num = 273

In [None]:
%%writefile task.py
l=enumerate
r=range
def p(u):
 u=[e[:]for e in u];f=[[e for(e,f)in l(e)if f==4]for e in u]
 for(q,m)in l(f):
  for(f,e)in l(m):
   for f in m[f+1:]:
    for p in r(q+1,len(u)):
     if u[p][e]==4and u[p][f]==4:
      for p in r(q+1,p):u[p][e+1:f]=[2]*(f-e-1)
 return u

In [None]:
verify_program(task_num, load_examples(task_num))

In [None]:
show_examples(load_examples(task_num)['train'])
show_examples(load_examples(task_num)['arc-gen'][101:104])

### Simplification

The initial version of the code creates a copy of the grid and, for each row, finds all column indices containing the value 4. For every pair of those 4s in the same row, it scans downward to find rows that also have 4s in the same two columns. Whenever such a match is found, it fills all intermediate rows between those columns with 2s, then returns the modified copy.

After exploring several examples, I realized that this process could be simplified quite a bit:

1. Fixed Pattern of 4s: The pattern of 4s is always consistent. The rectangles are defined by pairs of vertical borders, so the first two 4s in a row already give the left and right edge of a rectangle; enumerating all 4s and all possible pairs isn’t necessary.

2. In-place Mutation: Since the task only involves filling cells, we don’t actually need to make a copy of the grid. Updating it directly in-place keeps the code shorter and is perfectly acceptable for this puzzle (and many similar ones).

3. Caching the Top Boundary: Using d.setdefault((x, y), i) lets us remember the top boundary for each column pair just once. This eliminates redundant downward scans - when a later row with the same (x, y) pair appears, we can instantly fill the entire vertical span in a single step.
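
The caching idea from point 3 can be sketched on its own. The row numbers and column pairs below are invented for illustration; the point is only how dict.setdefault behaves on the first and on later sightings of a key.

```python
# Hypothetical input: (row index, (left column, right column)) for every row
# that contains a pair of 4s. These values are made up for the example.
rows_with_pair = [(1, (2, 6)), (4, (2, 6)), (5, (0, 3)), (8, (0, 3))]

top = {}     # column pair -> first row where that pair was seen
spans = []   # (pair, top row, bottom row) for each completed rectangle

for i, pair in rows_with_pair:
    # First sighting: stores i and returns i.
    # Later sightings: leaves the stored value alone and returns it.
    first = top.setdefault(pair, i)
    if first != i:
        spans.append((pair, first, i))

print(top)    # {(2, 6): 1, (0, 3): 5}
print(spans)  # [((2, 6), 1, 4), ((0, 3), 5, 8)]

# A slice whose start is past its end is empty, so a loop over it does nothing.
assert list(range(10))[1 + 1:1] == []
```

The final assertion is why a golfed loop over a[first+1:i] needs no explicit branch: on the first sighting first equals i, the slice is empty, and the loop body never runs.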

### Golf Tricks Applied

To achieve the shortest possible version of the simplified logic, I applied several classic golf-coding techniques that help remove unnecessary structure and characters:


* Tuple keys for pairs: Using (x,y) as a dictionary key avoids extra data structures or branching.
* Bit trick for width: The expression ~(x-y) equals y-x-1 (because ~n is -n-1 for integers), and [2]*~(x-y) is one character shorter than [2]*(y-x-1) - a neat micro-optimization.
* Slice fill: Using t[x+1:y] = [2]*… fills the entire horizontal span in a single step, replacing explicit loops.
* Row-slice iteration: for t in a[top+1:i]: iterates only over the rows that need filling, skipping index arithmetic and range construction.
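
A minimal sketch of the last three tricks working together, on a made-up 4×5 grid (not task data). The names top and bottom are illustrative; in the golfed code the top row comes from the dictionary and the bottom row is the current index.

```python
# Toy grid (illustrative): 4s mark the left and right edge in rows 0 and 3.
a = [
    [0, 4, 0, 0, 4],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 4, 0, 0, 4],
]
x, y = 1, 4          # columns of the two 4s
top, bottom = 0, 3   # rows that contain them

# ~n is -n-1, so ~(x-y) is the number of cells strictly between x and y.
width = ~(x - y)
assert width == y - x - 1 == 2

# a[top+1:bottom] is a new list, but it holds the same row objects,
# so slice assignment on t changes the grid itself.
for t in a[top + 1:bottom]:
    t[x + 1:y] = [2] * width

print(a)
# [[0, 4, 0, 0, 4], [0, 0, 2, 2, 0], [0, 0, 2, 2, 0], [0, 4, 0, 0, 4]]
```

Because unary ~ binds tighter than *, no parentheses are needed around ~(x-y) in [2]*~(x-y).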

Using these small but effective tricks, I arrived at the following version.
It doesn’t yet beat the absolute top score - but it significantly reduces total size while keeping the logic clean and readable.

In [None]:
%%writefile task.py
def p(a):
 d={}
 for i,r in enumerate(a):
  if 4 in r:
   x=r.index(4);y=r.index(4,x+1)
   for t in a[d.setdefault((x,y),i)+1:i]:t[x+1:y]=[2]*~(x-y)
 return a

In [None]:
verify_program(task_num, load_examples(task_num))

# Final Thoughts

Although these are just a few examples - and not from the hardest problems - they already show how small insights can bring the code much closer to the top scorers.

There are many other creative strategies shared by the community in discussions and comments, such as:

- recursion and string manipulation https://www.kaggle.com/competitions/google-code-golf-2025/discussion/611311#3301544
- alternative to neighbor checks https://www.kaggle.com/competitions/google-code-golf-2025/discussion/611930#3302376
- island check https://www.kaggle.com/competitions/google-code-golf-2025/discussion/612010#3302620
- and more

For me, the best next step is to keep adapting these golfing techniques and simplifications to different categories of puzzles. I also believe that borrowing tricks from simpler tasks and applying them in more complex challenges can lead to major breakthroughs.

And of course, it’s always worth learning from the best practices of the leaderboard leaders - that’s where the real mastery hides.

## Create submission

In [None]:
from zlib import compress

def zip_src(src):
    compression_level = 9  # Max Compression
    
    # We prefer that compressed source not end in a quotation mark
    while (compressed := compress(src, compression_level))[-1] == ord('"'): 
        src += b"#"
    
    def sanitize(b_in):
        """Clean up problematic bytes in compressed b-string"""
        b_out = bytearray()
        for b in b_in:
            if b == 0:         
                b_out += b"\\x00"
            elif b == ord("\r"): 
                b_out += b"\\r"
            elif b == ord("\\"): 
                b_out += b"\\\\"
            else: 
                b_out.append(b)
        return b"" + b_out
    
    compressed = sanitize(compressed)
    
    delim = b'"""' if ord("\n") in compressed or ord('"') in compressed else b'"'
    
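    # Build a self-extracting source file: the coding line makes Python read the
    # file as Latin-1, so bytes(..., "L1") recovers the compressed payload exactly.
    return b"#coding:L1\nimport zlib\nexec(zlib.decompress(bytes(" + \
        delim + compressed + delim + \
        b',"L1")))'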
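
Why the wrapper works can be checked with a small round trip. This is a sketch under my own assumptions: the one-line solution in src is a made-up stand-in, and the Latin-1 decode/encode pair imitates what happens when Python reads a file that starts with the coding line.

```python
import zlib

# Stand-in solution (hypothetical, not one of the competition tasks).
src = b"def p(g):return g[::-1]"
packed = zlib.compress(src, 9)

# Latin-1 maps every byte 0..255 to the code point with the same number,
# so decoding to text and encoding back is lossless for arbitrary bytes.
as_text = packed.decode("L1")
assert bytes(as_text, "L1") == packed

# Decompress and execute, as the generated wrapper does.
namespace = {}
exec(zlib.decompress(bytes(as_text, "L1")), namespace)
print(namespace["p"]([[1], [2]]))  # [[2], [1]]
```

The escaping done in sanitize is a separate concern: it only protects the bytes that would otherwise be altered or rejected inside a string literal.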

In [None]:
import os
source = "/kaggle/input/google-golf-code-tasks-dataset-com"
submission = "/kaggle/working/submission"
os.makedirs(submission, exist_ok=True)
os.chdir(submission)


# Copy tasks into submission folder
processed_tasks = 0
for task_num in range(1, 401):
    path_in = f"{source}/task{task_num:03d}.py"
    path_out = f"{submission}/task{task_num:03d}.py"
    
    if not os.path.exists(path_in):
        continue
    
    try:
        with open(path_in, "rb") as fin:
            code = fin.read()
        with open(path_out, "wb") as fout:
            fout.write(code)
        processed_tasks += 1
    except Exception as e:
        print(f"Error processing task{task_num:03d}: {e}")

print(f"Processed {processed_tasks} tasks")

In [None]:
import os
source = "/kaggle/input/google-golf-code-tasks-dataset-com"
submission = "/kaggle/working/submission"

total_save = 0
processed_tasks = 0

os.makedirs(submission, exist_ok=True)

# Process tasks 1-400
for task_num in range(1, 401):
    path_in = f"{source}/task{task_num:03d}.py"
    path_out = f"{submission}/task{task_num:03d}.py"
    
    if not os.path.exists(path_in):
        continue
    
    try:
        with open(path_in, "rb") as task_in:
            task_src = task_in.read()

        # Only compress if file has content
        if len(task_src) > 0:
            zipped_src = zip_src(task_src)
            improvement = len(task_src) - len(zipped_src)
            
            # Use compressed version if it saves space
            if improvement > 0:
                task_src = zipped_src
                total_save += improvement
            
            with open(path_out, "wb") as task_out:
                task_out.write(task_src)
            
            processed_tasks += 1
    
    except Exception as e:
        print(f"Error processing task{task_num:03d}: {e}")
        continue

print(f"Processed {processed_tasks} tasks")
print(f"Saved {total_save}b with zlib compression")

In [None]:
import zipfile

submission_zip = f"{submission}.zip"

with zipfile.ZipFile(submission_zip, "w", zipfile.ZIP_DEFLATED) as zipf:
    task_count = 0
    for task_num in range(1, 401):
        task_id = f"{task_num:03d}"
        src_path = f"{submission}/task{task_id}.py"
        
        if os.path.exists(src_path):
            zipf.write(src_path, arcname=f"task{task_id}.py")
            task_count += 1

print(f"Created submission zip with {task_count} tasks: {submission_zip}")

# Display zip file size
zip_size = os.path.getsize(submission_zip)
print(f"Submission zip size: {zip_size:,} bytes ({zip_size/1024:.1f} KB)")