# Finite field Nikodym problem

**Prompt used**

Problem Statement:

The finite field Nikodym problem asks for the minimum size of a Nikodym set in a vector space over a finite field. Your goal is to find Nikodym sets in F_q^2 that are as small as possible, where q=p^2 for a given prime p.

A set N ⊆ F_q^2 is a Nikodym set if, for every point x in F_q^2, there exists a line l passing through x such that the "punctured line" l \ {x} is entirely contained within N. Formally, for every point x in F_q^2, there must exist a direction vector v in F_q^2 \ {0} such that the set {x + t·v : t in F_q, t ≠ 0} is a subset of N.

Your task is to write a Python function, search_for_best_construction, that directly returns a proposed Nikodym set.

The input to your function will be:

p: The prime number for the base field F_p. The full field is F_q where q=p^2.
d: The dimension of the vector space, which will be fixed at 2.

The function must return:

A NumPy array of shape (k, 2, 2) representing the k points in your proposed Nikodym set N. Each of the k points has 2 coordinates in F_q. Each coordinate from F_q is represented as a pair of integers from F_p.

How Your Function is Used:

Your function directly produces the set N. An external function will then verify if your returned set is a valid Nikodym set by checking the condition for every point x in the space F_q^2.

Your Goal and Evaluation:

Your goal is to define the set N in such a way that it has the smallest possible size (cardinality k) while still being a valid Nikodym set.

Your function will be evaluated based on the negative size of the resulting set N. A smaller set results in a less negative (i.e., higher) score. If your function returns an invalid Nikodym set, it will receive a very large penalty.

Hint: You must find a general construction that works well for various primes p. The optimal construction misses around p^3 - C * p^2 * log p points from the entire (F_q)^2, meaning the Nikodym set has size p^4 - p^3 + C * p^2 * log p. This is what you should aim for. First try to find a construction of size p^4 - c * p^3, and once you have this, try to go for the optimal value of p^4 - p^3 + C * p^2 * log p. The "normalized_score" you are seeing for p=29 is the size of the complement. Once you are on the right track and the complement has cubic size (of order p^3), this score should be above 10000 (or very close to it). The optimal construction of size p^4 - p^3 + C * p^2 * log p will have a score above 20000 for p=29.

Good luck!
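
To make the definition in the problem statement concrete, here is a minimal brute-force check of the Nikodym condition. As a simplifying assumption it works over the prime field F_p rather than F_{p^2}, so coordinates are plain integers mod p. The name `is_nikodym_prime_field` is hypothetical and is not part of the experiment.

```python
from itertools import product


def is_nikodym_prime_field(points, p):
    """Brute-force Nikodym check in F_p^2 (prime field only, for illustration)."""
    n_set = set(points)
    # One representative per direction: (1, m) for every m in F_p, plus (0, 1).
    directions = [(1, m) for m in range(p)] + [(0, 1)]
    for x in product(range(p), repeat=2):
        has_line = any(
            # The punctured line {x + t*v : t != 0} must lie inside the set.
            all(
                ((x[0] + t * v[0]) % p, (x[1] + t * v[1]) % p) in n_set
                for t in range(1, p)
            )
            for v in directions
        )
        if not has_line:
            return False
    return True


p = 3
full = list(product(range(p), repeat=2))
print(is_nikodym_prime_field(full, p))      # True: the whole plane
print(is_nikodym_prime_field(full[1:], p))  # True: the plane minus the origin
print(is_nikodym_prime_field(full[:1], p))  # False: a single point
```

The same loop structure (every point, every direction, every nonzero t) is what makes the full check over F_{p^2} expensive.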

In [None]:
#@title Code found by AlphaEvolve

"""AlphaEvolve experiment for the Nikodym problem in F_q^2."""
import itertools
import logging
import time
from scipy import integrate
import numpy as np
from scipy import optimize
import warnings
import random
import re
from collections.abc import Callable, Mapping
from typing import Any, List, Tuple
import scipy.linalg as la
import collections
import copy
import math
import numba
from scipy.optimize import milp, LinearConstraint, Bounds
from itertools import product

njit = numba.njit


def get_prime_factorization(num: int) -> Mapping[int, int]:
    """Returns the prime factorization of num as a dict."""
    factors = collections.defaultdict(int)
    d = 2
    temp = num
    while d * d <= temp:
        while temp % d == 0:
            factors[d] += 1
            temp //= d
        d += 1
    if temp > 1:
        factors[temp] += 1
    return factors


# Here are the best constructions for small values of the parameter,
# that you have found so far:

# PREVIOUS CONSTRUCTIONS START HERE


normalized_score_p29_iqhd = 19285.00002998754


# PREVIOUS CONSTRUCTIONS END HERE


def search_for_best_construction(p: int, d: int):
  """Search for the best Nikodym set for F_q^2."""

  if d != 2:
    # Fallback for other dimensions to prevent errors.
    return np.array([], dtype=np.int64).reshape(0, d, 2)

  # Find coefficients for an irreducible polynomial x^2 = irr_a*x + irr_b over F_p
  # to construct F_q.
  # For p>2, x^2 - k is irreducible for k non-square. (a=0, b=k)
  # For p=2, x^2 + x + 1 is irreducible. (a=1, b=1 mod 2, from x^2 = -x-1)
  if p == 2:
      irr_a, irr_b = 1, 1
  else:
      non_square = next(i for i in range(1, p) if pow(i, (p - 1) // 2, p) == p - 1)
      irr_a, irr_b = 0, non_square
  q = p * p

  # This construction refines Guo, Kopparty, and Sudan (2013).
  # The complement set is M' = {(x,y) | Tr(x)=N(y) and y not in S},
  # for a special set S. The resulting Nikodym set N' = F_q^2 \ M' has size
  # p^4 - (p^3 - p*|S|) = p^4 - p^3 + p*|S|.
  # The score is the size of the complement, p^3 - p*|S|, so a smaller S
  # scores higher. The hint gives p^3 - C*p^2*log(p) as the target for this
  # value, which corresponds to |S| of order p*log(p).

  # F_q arithmetic functions (elements are (a,b) for a+b*w where w^2=irr_a*w+irr_b)
  def fq_mul(a, b):
      return ((a[0] * b[0] + irr_b * a[1] * b[1]) % p,
              (a[0] * b[1] + a[1] * b[0] + irr_a * a[1] * b[1]) % p)

  def fq_pow(a, n):
      res = (1, 0)
      base = a
      while n > 0:
          if n % 2 == 1:
              res = fq_mul(res, base)
          base = fq_mul(base, base)
          n //= 2
      return res

  # Find a generator for F_q^*
  q_minus_1 = q - 1
  prime_factors = get_prime_factorization(q_minus_1)

  random.seed(p)
  while True:
      g = (random.randint(0, p - 1), random.randint(0, p - 1))
      if g == (0, 0):
          continue
      is_generator = True
      for p_factor in prime_factors:
          if fq_pow(g, q_minus_1 // p_factor) == (1, 0):
              is_generator = False
              break
      if is_generator:
          break

  # Construct the set S of "special" points.
  # Theory suggests |S| should be large enough to be a hitting set for a
  # family of p-dimensional subspaces. A union bound argument on a simplified
  # model suggests m > p * log((q-1)/2). We take m to be the ceiling of this
  # bound, because a smaller m yields a better score.
  m_bound = p * (math.log(q - 1) - math.log(2))
  m = math.ceil(m_bound)  # Smallest integer >= m_bound.
  m = int(min(m, q - 1))
  S = set()
  current_g_power = (1, 0)  # Start with g^0=1
  for _ in range(m):
      current_g_power = fq_mul(current_g_power, g)  # g^1, g^2, ...
      S.add(current_g_power)

  # Vectorized implementation of the construction.
  coords = np.mgrid[0:p, 0:p, 0:p, 0:p]
  all_points_flat = coords.reshape(4, -1).T.astype(np.int64)
  x1, x2, y1, y2 = all_points_flat.T

  # Condition for being in the base complement set M: Tr(x) = N(y)
  if p == 2:
      # For F_4 = F_2(w) with w^2+w+1=0.
      # x = x1+x2*w, Tr(x) = x2
      # y = y1+y2*w, N(y) = y1^2+y1*y2+y2^2
      trace_x = x2
      norm_y = (y1**2 + y1 * y2 + y2**2) % p
  else:
      # For F_{p^2} = F_p(sqrt(k))
      # x = x1+x2*sqrt(k), Tr(x) = 2*x1
      # y = y1+y2*sqrt(k), N(y) = y1^2-k*y2^2
      trace_x = (2 * x1) % p
      norm_y = (y1**2 - irr_b * y2**2) % p
  in_m_basic = (trace_x == norm_y)

  # The refined complement M' requires y to not be in S.
  S_mask = np.zeros((p, p), dtype=bool)
  for s_val in S:
      S_mask[s_val[0], s_val[1]] = True
  y_not_in_S = ~S_mask[y1, y2]

  in_m_prime = in_m_basic & y_not_in_S

  # The Nikodym set N' contains all points not in M'.
  nikodym_points_flat = all_points_flat[~in_m_prime]

  return nikodym_points_flat.reshape(-1, 2, 2)
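
The size bookkeeping in the comments of the function above can be checked with a few lines of arithmetic. The sketch below recomputes |S| from the same bound and returns the complement size p^3 - p*|S| = p*(q - |S|). The helper name `expected_complement_size` is hypothetical. It only counts points and does not verify the Nikodym property.

```python
import math


def expected_complement_size(p: int) -> int:
    """Complement size p*(q - |S|), with |S| chosen as in the construction."""
    q = p * p
    m = math.ceil(p * (math.log(q - 1) - math.log(2)))
    m = min(m, q - 1)  # S holds m distinct powers of a generator of F_q^*
    return p * (q - m)


print(expected_complement_size(29))  # 19285
```

For p = 29 this agrees with the recorded `normalized_score_p29_iqhd`, up to the small random noise that the evaluator adds to each score.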



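The closed forms Tr(x) = 2*x1 and N(y) = y1^2 - k*y2^2 used for odd p can be cross-checked against the field-theoretic definitions Tr(a) = a + a^p and N(a) = a^(p+1). The sketch below is a standalone check under the same representation, F_{p^2} = F_p(w) with w^2 = k for a non-square k. The helper names are illustrative and are not taken from the experiment.

```python
def fq_mul(a, b, p, k):
    """Multiply a0 + a1*w and b0 + b1*w in F_p[w]/(w^2 - k)."""
    return ((a[0] * b[0] + k * a[1] * b[1]) % p,
            (a[0] * b[1] + a[1] * b[0]) % p)


def fq_pow(a, n, p, k):
    """Square-and-multiply exponentiation in F_{p^2}."""
    res = (1, 0)
    while n > 0:
        if n & 1:
            res = fq_mul(res, a, p, k)
        a = fq_mul(a, a, p, k)
        n >>= 1
    return res


def check_trace_norm(p):
    """Compares a + a^p and a^(p+1) with the closed forms, for an odd prime p."""
    k = next(i for i in range(1, p) if pow(i, (p - 1) // 2, p) == p - 1)
    for a0 in range(p):
        for a1 in range(p):
            a = (a0, a1)
            frob = fq_pow(a, p, p, k)  # Frobenius image a^p
            trace = ((a[0] + frob[0]) % p, (a[1] + frob[1]) % p)
            norm = fq_mul(a, frob, p, k)  # a * a^p = a^(p+1)
            if trace != ((2 * a0) % p, 0):
                return False
            if norm != ((a0 * a0 - k * a1 * a1) % p, 0):
                return False
    return True


print(check_trace_norm(5), check_trace_norm(7))  # True True
```

Both values land in the base field F_p (second component 0), which is why the construction can compare them as integers mod p.
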
In [None]:
#@title Initial program used

def search_for_best_construction(p: int, d: int):
  """Search for the best Nikodym set for F_q^2."""

  if d == 2:
    # 1. First, create all elements of the field F_q, where q=p^2.
    #    Each element is a tuple of two integers from F_p.
    fq_elements = list(itertools.product(range(p), repeat=2))

    # 2. Now, create all points in the space F_q^2.
    #    Each point is a pair of F_q elements.
    all_points_in_fq2 = list(itertools.product(fq_elements, repeat=2))

    # This creates a NumPy array with the correct shape (p**4, 2, 2).
    best_construction = np.array(all_points_in_fq2, dtype=np.int64)
  else:
    # Fallback for other dimensions to prevent errors.
    best_construction = np.array([], dtype=np.int64).reshape(0, d, 2)

  return best_construction
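
The initial program returns every point of F_q^2, which is trivially a Nikodym set of size p^4. As a small illustration of the point layout, the pure-Python sketch below (no NumPy, names chosen for this example only) enumerates the points as nested pairs ((x1, x2), (y1, y2)) and shows that flattening them gives the same order as enumerating 4-tuples (x1, x2, y1, y2) directly.

```python
from itertools import product

p = 3
# Nested form, as built by the initial program: ((x1, x2), (y1, y2)).
nested = list(product(product(range(p), repeat=2), repeat=2))
# Flat form: (x1, x2, y1, y2), with the last coordinate varying fastest.
flat = list(product(range(p), repeat=4))

assert len(nested) == p ** 4
assert [x + y for x, y in nested] == flat
print(nested[5])  # ((0, 0), (1, 2))
```

Converting `nested` to an array gives the `(p**4, 2, 2)` shape the evaluator expects.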

In [None]:
#@title Verification code


@njit
def _mod_pow_numba(base: int, exp: int, mod: int) -> int:
  """A Numba-compatible implementation of modular exponentiation."""
  res = 1
  base %= mod
  while exp > 0:
    if exp % 2 == 1:
      res = (res * base) % mod
    base = (base * base) % mod
    exp //= 2
  return res


@njit
def _find_non_square_numba(p: int) -> int:
  """Finds a quadratic non-residue modulo p."""
  for w in range(2, p):
    # Use the Numba-compatible modular exponentiation function
    if _mod_pow_numba(w, (p - 1) // 2, p) == p - 1:
      return w
  return -1  # Should not be reached for p > 2


@njit
def _fq_add(
    a: Tuple[np.int64, np.int64], b: Tuple[np.int64, np.int64], p: int
) -> Tuple[np.int64, np.int64]:
  """Adds two elements in F_q, where q=p^2."""
  return (np.int64((a[0] + b[0]) % p), np.int64((a[1] + b[1]) % p))


@njit
def _fq_mul(
    a: Tuple[np.int64, np.int64],
    b: Tuple[np.int64, np.int64],
    p: int,
    w: int,
) -> Tuple[np.int64, np.int64]:
  """Multiplies two elements in F_q using irreducible polynomial x^2 - w."""
  # (a0 + a1*x) * (b0 + b1*x) = a0*b0 + (a0*b1 + a1*b0)*x + a1*b1*x^2
  # Since x^2 = w, this becomes: (a0*b0 + a1*b1*w) + (a0*b1 + a1*b0)*x
  res0 = (a[0] * b[0] + a[1] * b[1] * w) % p
  res1 = (a[0] * b[1] + a[1] * b[0]) % p
  return (np.int64(res0), np.int64(res1))


@njit
def _is_point_in_construction(
    point_to_check: np.ndarray, construction: np.ndarray
) -> bool:
  """Numba-friendly check if a point exists in the construction array."""
  # point_to_check has shape (2, 2) and construction has shape (k, 2, 2)
  for i in range(construction.shape[0]):
    if (
        construction[i, 0, 0] == point_to_check[0, 0]
        and construction[i, 0, 1] == point_to_check[0, 1]
        and construction[i, 1, 0] == point_to_check[1, 0]
        and construction[i, 1, 1] == point_to_check[1, 1]
    ):
      return True
  return False


@njit
def is_valid_nikodym_numba(construction: np.ndarray, p: int, d: int) -> bool:
  """Checks if a construction is a valid Nikodym set in F_q^d, where q=p^2."""
  if d != 2:
    return False

  w = _find_non_square_numba(p)
  if w == -1:
    return False  # Cannot define F_q

  # Create a set from the construction for efficient lookups.
  dummy_tuple = ((np.int64(0), np.int64(0)), (np.int64(0), np.int64(0)))
  construction_set = {dummy_tuple}
  construction_set.clear()
  for i in range(construction.shape[0]):
    p1 = (np.int64(construction[i, 0, 0]), np.int64(construction[i, 0, 1]))
    p2 = (np.int64(construction[i, 1, 0]), np.int64(construction[i, 1, 1]))
    construction_set.add((p1, p2))

  # Helper to check if a "punctured" line is fully contained in the set.
  # This function is defined inside to be captured by numba's jit compiler.
  def check_punctured_line_is_contained(point_x, direction_v, const_set):
    # Iterate through all non-zero elements t in F_q.
    for t_a in range(p):
      for t_b in range(p):
        if t_a == 0 and t_b == 0:
          continue

        t = (np.int64(t_a), np.int64(t_b))
        # Calculate the point on the line: x + t*v
        tv0 = _fq_mul(t, direction_v[0], p, w)
        tv1 = _fq_mul(t, direction_v[1], p, w)
        p_on_line0 = _fq_add(point_x[0], tv0, p)
        p_on_line1 = _fq_add(point_x[1], tv1, p)
        point_on_line = (p_on_line0, p_on_line1)

        if point_on_line not in const_set:
          return False
    return True

  # --- Main Loop: Iterate through EVERY point x in F_q^2 ---
  for x1a in range(p):
    for x1b in range(p):
      x1 = (np.int64(x1a), np.int64(x1b))
      for x2a in range(p):
        for x2b in range(p):
          x2 = (np.int64(x2a), np.int64(x2b))
          x = (x1, x2)
          found_line_for_x = False

          # A full set of directions in F_q^2 can be represented by
          # (1, m) for all m in F_q, and (0, 1).

          # Case 1: Directions v = (1, m)
          v1_fq = (np.int64(1), np.int64(0))
          for m_a in range(p):
            if found_line_for_x:
              break
            for m_b in range(p):
              m_fq = (np.int64(m_a), np.int64(m_b))
              v = (v1_fq, m_fq)
              if check_punctured_line_is_contained(x, v, construction_set):
                found_line_for_x = True
                break
          if found_line_for_x:
            continue

          # Case 2: Direction v = (0, 1)
          v0_fq = (np.int64(0), np.int64(0))
          v = (v0_fq, v1_fq)
          if check_punctured_line_is_contained(x, v, construction_set):
            found_line_for_x = True

          if not found_line_for_x:
            return False  # Not a valid Nikodym set.

  return True


def calculate_score(construction: np.ndarray, p: int, d: int) -> float:
  """Calculates the score for a given construction."""
  if (
      not isinstance(construction, np.ndarray)
      or construction.ndim != 3
      or construction.shape[1] != d
      or construction.shape[2] != 2
  ):
    return np.inf

  if construction.shape[0] == 0:
    return np.inf

  # Remove duplicate points before scoring
  unique_tuples = {tuple(tuple(coord) for coord in row) for row in construction}
  unique_construction = np.array(list(unique_tuples), dtype=np.int64)
  size = unique_construction.shape[0]

  # NOTE: The validation can be very slow (O(p^8)) and is best suited for
  # small primes. For larger primes, the correctness of the algebraic
  # construction is assumed.
  is_nikodym = is_valid_nikodym_numba(unique_construction, p, d)

  if not is_nikodym:
    # np.inf is a sentinel for an invalid set; evaluate() skips such scores.
    return np.inf

  return -float(size)


def evaluate() -> dict[str, float]:
  """Evaluates a Nikodym set construction for given parameters."""
  result = {}

  # The O(p^8) validation is expensive, so only a few primes are tested.
  # Only primes above 20 (here p=29) contribute to the reported score.
  primes_to_test = [5, 7, 11, 29]
  dims_to_test = [2]

  scores = []
  best_constructions = {}
  time_start = time.time()
  for d in dims_to_test:
    for p in primes_to_test:
      q = p * p
      construction = search_for_best_construction(p, d)
      score = calculate_score(construction, p, d) + np.random.uniform(0, 0.0001)

      if score == np.inf:
        print(f'p={p}, d={d}: Invalid construction found.')
        # scores.append(0)  # Penalize failure
      elif p > 20:
        # score is -|N|, so negating it recovers the set size |N|.
        size = -score
        # Subtract the size of the whole space, q^d. The result is minus the
        # size of the complement of N.
        normalized_score = size - q**d
        scores.append(normalized_score)
        logging.info(
            'p=%d, q=%d, d=%d: |N|=%.0f, normalized_score=%.4f',
            p,
            q,
            d,
            size,
            normalized_score,
        )

  average_score = (
      np.mean(scores) + np.random.uniform(0, 0.0001) if scores else 1000
  )
  logging.info('Time taken for full evaluation: %s', time.time() - time_start)
  logging.info('Average normalized score: %s', average_score)
  # Negate so that the reported score is the average complement size.
  result['score'] = -average_score

  return result
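
The sign conventions in `evaluate()` are easy to lose track of, so here is a minimal sketch of the arithmetic for a single prime, ignoring the small random noise the evaluator adds. The function name `reported_score` is hypothetical.

```python
def reported_score(set_size: int, p: int, d: int = 2) -> int:
    """Mirrors evaluate(): normalized_score = |N| - q^d, reported score = -normalized_score."""
    q = p * p
    normalized_score = set_size - q ** d  # zero or negative for any subset of F_q^d
    return -normalized_score  # the size of the complement of N


print(reported_score(29 ** 4, 29))          # 0: the full space misses nothing
print(reported_score(29 ** 4 - 19285, 29))  # 19285
```

The reported score is therefore the number of points left out of N, which is the quantity the hint in the prompt calls "normalized_score".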