Permalink
Cannot retrieve contributors at this time
1219 lines (1077 sloc)
59.5 KB
| \documentclass[12pt]{article} | |
| \usepackage{amsmath,amssymb,amsthm,amscd,amsfonts} | |
| \usepackage{tikz} | |
| \usepackage{url} | |
| \usepackage{fullpage} | |
| \usepackage{mathtools} | |
| \usepackage{float} | |
| %\usepackage{rotating} | |
| \usepackage{pdflscape} | |
| \usepackage[colorlinks=true, | |
| linkcolor=webgreen, | |
| filecolor=webbrown, | |
| citecolor=webgreen]{hyperref} | |
| \definecolor{webgreen}{rgb}{0,.5,0} | |
| \definecolor{webbrown}{rgb}{.6,0,0} | |
| \usepackage{color} | |
| \usepackage{microtype} | |
| \DeclarePairedDelimiter\abs{\lvert}{\rvert} | |
| \DeclarePairedDelimiter\paren{\lparen}{\rparen} | |
| \def\subw{\mathrel{\triangleleft}} | |
| \DeclareMathOperator\supe{sup} | |
| \DeclareMathOperator\subb{sub} | |
| \theoremstyle{plain} | |
| \newtheorem{theorem}{Theorem} | |
| \newtheorem{corollary}[theorem]{Corollary} | |
| \newtheorem{lemma}[theorem]{Lemma} | |
| \newtheorem{proposition}[theorem]{Proposition} | |
| %\newtheorem{test}{Test} | |
| \theoremstyle{definition} | |
| \newtheorem{definition}[theorem]{Definition} | |
| \newtheorem{example}[theorem]{Example} | |
| \newtheorem{conjecture}[theorem]{Conjecture} | |
| %\theoremstyle{remark} | |
| %\newtheorem{remark}[theorem]{Remark} | |
| \newtheorem{problem}[theorem]{Problem} | |
| \newcommand{\0}{\mathtt{0}} | |
| \newcommand{\1}{\mathtt{1}} | |
| \newcommand{\2}{\mathtt{2}} | |
| \newcommand{\3}{\mathtt{3}} | |
| \newcommand{\4}{\mathtt{4}} | |
| \newcommand{\5}{\mathtt{5}} | |
| \newcommand{\6}{\mathtt{6}} | |
| \newcommand{\7}{\mathtt{7}} | |
| \newcommand{\8}{\mathtt{8}} | |
| \newcommand{\9}{\mathtt{9}} | |
| \newcommand{\A}{\mathtt{A}} | |
| \newcommand{\B}{\mathtt{B}} | |
| \newcommand{\union}{\cup} | |
| \newcommand{\set}[2]{\{\,#1{}:{}#2\,\}} | |
| \renewcommand{\a}{@} | |
| % Obfuscate @ sign to help avoid spam | |
| \usepackage{auto-pst-pdf,pst-text} | |
| \renewcommand{\a}{\begin{pspicture}\pscharpath[fillstyle=solid,fillcolor=black,linestyle=none]{@}\end{pspicture}} | |
| \newcommand{\seqnum}[1]{\href{http://oeis.org/#1}{\underline{#1}}} | |
| \newcommand{\updated}[1]{{\color{red}#1}} | |
| \renewcommand{\updated}[1]{#1} | |
| \title{Minimal Elements for the Prime Numbers} | |
| \author{Curtis Bright\\ | |
| School of Computer Science\\ | |
| University of Waterloo\\ | |
| Waterloo, ON N2L 3G1\\ | |
| Canada\\ | |
| {\tt cbright\a uwaterloo.ca} \\ | |
| \ \\ | |
| Raymond Devillers\\ | |
| D\'epartement d'Informatique, CP 212\\ | |
| Universit\'e Libre de Bruxelles \\ | |
| B-1050 Bruxelles\\ | |
| Belgium\\ | |
| {\tt rdevil\a ulb.ac.be} \\ | |
| \ \\ | |
| Jeffrey Shallit \\ | |
| School of Computer Science\\ | |
| University of Waterloo\\ | |
| Waterloo, ON N2L 3G1\\ | |
| Canada \\ | |
| {\tt shallit\a cs.uwaterloo.ca}} | |
| \begin{document} | |
| \maketitle | |
| \begin{abstract} | |
| We say a string of symbols $s$ is \textit{minimal} for a language $L$ | |
| if $s$ is a member of $L$, and it is not possible to obtain another | |
| member of $L$ by striking out one or more symbols from $s$. Although | |
| the set $M(L)$ of minimal strings is necessarily finite, determining | |
| it explicitly for a given $L$ can be a difficult computational problem. | |
| We use some number-theoretic heuristics to compute $M(L)$, where $L$ | |
| is the language of base-$b$ representations of the prime numbers, | |
| for $2 \leq b \leq 30$. | |
| \end{abstract} | |
| \section{Introduction} | |
| Problems about the digits of prime numbers have a long history, and many | |
| of them are still unsolved. For example, are there infinitely many | |
| primes, all of whose base-$10$ digits are $\1$? Currently, | |
| there are only five such ``repunits'' | |
| known \cite{WD}, | |
| corresponding to $(10^p-1)/9$ for $p \in \{ 2, 19, 23, 317, 1031 \}$. | |
| It seems likely that four more are given by | |
| $p \in \{ 49081, 86453, 109297, 270343\}$, but this has not yet been | |
| rigorously proven. | |
| Another problem on the digits of primes was introduced by the | |
| third author \cite{Sh00}. To describe it, we need some definitions. | |
| We say that a string $x$ is a \textit{subword} of a string $y$, and | |
| we write $x \subw y$, if | |
| one can strike out zero or more symbols of $y$ to get $x$. | |
| For example, {\tt string} is a subword of | |
| {\tt Meistersinger}. | |
| (In the literature, this concept is sometimes called a ``scattered | |
| subword'' or ``substring'' or ``subsequence''.) | |
| A \textit{language} is a set of strings. A string $s$ is \textit{minimal} | |
| for $L$ if (a) $s \in L$ and (b) if $x \in L$ and $x \subw s$, then | |
| $x = s$. The set of all minimal strings of $L$ is denoted $M(L)$. | |
| In this paper, we describe a heuristic | |
| technique for determining $M(L_b)$ in the | |
| case where $L_b$ consists of the representations, in base $b$, of the | |
| the prime numbers $\lbrace 2, 3, 5, \dotsc \rbrace$. We obtain | |
| a complete characterization of $M(L_b)$ for bases | |
| $2 \leq b \leq 16$ and \updated{$b = 18$, $20$, $22$, $23$, $24$, and $30$. For the remaining | |
| bases $b = 17$, $19$, $21$, and $25\leq b\leq 29$,} | |
| we obtain results that allow us to ``almost'' completely characterize this set. | |
| The same technique can also be applied to find minimal sets for subsets of prime numbers. | |
| For example, we were able to determine the minimal set for primes of the form $4n+1$ | |
| represented in base 10 (and similarly for those of the form $4n+3$). | |
| This successfully completes the sequences | |
| \seqnum{A111055} and \seqnum{A111056} in the | |
| \textit{Encyclopedia of Integer Sequences}~\cite{oeis}, | |
| which had been incomplete | |
| since their introduction to the Encyclopedia in 2005. | |
| Finally, we also show that determining the minimal set for | |
| the composite numbers represented in base $b$ is a significantly easier problem, | |
| and explicitly compute the minimal set for all $2\leq b\leq 30$. | |
| \subsection{Notation} | |
| In what follows, if $x$ is a string of symbols over the alphabet | |
| $\Sigma_b \coloneqq \lbrace \0, \1, \dotsc, b-1 \rbrace$ we let | |
| $[x]_b$ denote the evaluation of $x$ in base $b$ (starting with the | |
| most significant digit)\updated{, and $[\epsilon]_b\coloneqq0$ where $\epsilon$ | |
| is the empty string}. This is extended to languages as follows: | |
| $[L]_b \coloneqq \set{ [x]_b }{ x \in L }$. | |
| We use the convention that $\A \coloneqq 10$, $\B \coloneqq 11$, and so forth, | |
| to conveniently represent strings of symbols in base $b > 10$. | |
| We let $(x)_b$ be the canonical representation | |
| of $x$ in base-$b$, that is, the representation without leading zeroes. | |
| %The set of all canonical representations in base $b$ is denoted | |
| %$C_b$. | |
| Finally, as usual, for a language $L$ we let | |
| $L^n \coloneqq \overbrace{LL\dotsm L}^n$ and $L^* \coloneqq \bigcup_{i \geq 0} L^i$. | |
| \section{Why minimal sets are interesting} | |
| One reason why | |
| the minimal set $M(L)$ of a language $L$ is interesting is because it | |
| allows us to compute two natural and related languages, | |
| defined as follows: | |
| \begin{align*} | |
| \subb(L) &\coloneqq \set{x \in \Sigma^*}{\text{there exists $y \in L$ such that $x \subw y$}} ; \\ | |
| \supe(L) &\coloneqq \set{x \in \Sigma^*}{\text{there exists $y \in L$ such that $y \subw x$}} . | |
| \end{align*} | |
| An amazing fact is that $\subb(L)$ and $\supe(L)$ are always regular. | |
| This follows from | |
| the following classical theorem due to Higman \cite{Hi52} and | |
| Haines \cite{Ha69}. | |
| \begin{theorem} | |
| For every language\/ $L$, there are only finitely many minimal strings. | |
| \end{theorem} | |
| Indeed, we have $\supe(L) = \supe(M(L))$ | |
| and $\Sigma^* - \subb(L) = \supe(M(\Sigma^* - \subb(L)))$, and | |
| the superword language of a finite language is regular, since | |
| \[ \supe\paren[\big]{\{w_1,\dotsc,w_n\}} = \bigcup_{i=1}^n \Sigma^* w_{i,1} \Sigma^* \dotsm \Sigma^* w_{i,\abs{w_i}} \Sigma^* \] | |
| where $w_i=w_{i,1}\dotsm w_{i,\abs{w_i}}$ with $w_{i,j}\in\Sigma$. | |
| \section{Why the problem is hard} | |
| Determining $M(L)$ for arbitrary $L$ | |
| is in general unsolvable, and can be difficult even when $L$ | |
| is relatively simple~\cite{GHK07,GHK09}. | |
| The following is a ``semi-algorithm'' that | |
| is guaranteed to produce $M(L)$, but it is not so easy to implement: | |
| \begin{figure}[H] | |
| (1) $ M \coloneqq \emptyset$ \\ | |
| (2) while ($L \not= \emptyset$) do \\ | |
| \hphantom{} \qquad (3) choose $x$, a shortest string in $L$ \\ | |
| \hphantom{} \qquad (4) $ M \coloneqq M \union \lbrace x \rbrace$ \\ | |
| \hphantom{} \qquad (5) $ L \coloneqq L - \supe(\lbrace x \rbrace) $ | |
| \end{figure} | |
| In practice, for arbitrary $L$, | |
| we cannot feasibly carry out step (5). Instead, we work | |
| with $L'$, some regular overapproximation to $L$, until we can show $L' = | |
| \emptyset$ (which implies $L = \emptyset$). In practice, | |
| $L'$ is usually chosen to be a finite union of sets of the form | |
| $L_1 L_2^* L_3$, where each of $L_1$, $L_2$, $L_3$ is finite. | |
| In the case we consider in this paper, | |
| we then have to determine whether such a language contains | |
| a prime or not. | |
| However, it is not even known if the following simpler decision problem is | |
| recursively solvable: | |
| \begin{problem} | |
| Given strings $x$, $y$, $z$, and a base $b$, does there exist a prime | |
| number whose base-$b$ expansion is of the form $x \overbrace{yy\cdots y}^n z$ | |
| for some $n \geq 0$? | |
| \end{problem} | |
| An algorithm to solve this problem, for example, would allow us to decide | |
| if there are any additional Fermat primes (of the form $2^{2^n}+1$) | |
| other than the known ones (corresponding to $n = 0$, $1$, $2$, $3$, $4$). | |
| To see this, take $b\coloneqq2$, $x\coloneqq\1$, $y\coloneqq\0$, and $z\coloneqq\0^{16}\1$. | |
| Since if $2^n+1$ is prime then $n$ must be a power of two, a prime of the | |
| form $[xy^*z]_b$ must be a new Fermat prime. | |
| Therefore, in practice, we are forced to try to rule out prime | |
| representations based on heuristics such as | |
| modular techniques and factorizations. | |
| This is discussed in the next section. | |
| \section{Some useful lemmas}\label{seclemmas} | |
| It will be necessary for our algorithm to determine if families of the | |
| form $[xL^*z]_b$ contain a prime or not. We use two different | |
| heuristic strategies | |
| to show that such families contain no primes. | |
| In the first strategy, we mimic the well-known technique of ``covering | |
| congruences'' \cite{Choi}, | |
| by finding some finite set $S$ of integers $N > 1$ such that | |
| every number in a given family is divisible by some element of $S$. | |
| In the second strategy, we attempt to find a | |
| difference-of-squares or difference-of-cubes factorization. | |
| \subsection{The first strategy}\label{subsecfirststrat} | |
| We start with the simplest version of the idea: to find | |
| an $N > 1$ that divides each element of the family $[xL^*z]_b$. | |
| At first glance, this would require | |
| checking that $N$ divides $xL^nz$ for $n=0$, $1$, $2$, $\dotsc$. | |
| However, the following lemma shows that it is only necessary to check | |
| the two cases $n=0$ and $1$. Although divisibility based on digital | |
| considerations has a long history (e.g., \cite[Chap.~XII]{Dick}), | |
| we could not find these kinds of results in the literature. | |
| \begin{lemma}\label{lemone} | |
| Let\/ $x$, $z\in \Sigma^*_b$, and let\/ $L\subseteq\Sigma^*_b$. | |
| Then\/ $N$ divides all numbers of the form\/ $[xL^*z]_b$ | |
| if and only if\/ $N$ divides\/ $[xz]_b$ and all numbers of the form\/ $[xLz]_b$. | |
| \end{lemma} | |
| \begin{proof}%($\Rightarrow$) | |
| Let $y=y_1\dotsm y_n\in L^*$, where $y_1$, $\dotsc$, $y_n\in L$. | |
| By telescoping we have | |
| \[ [xyz]_b - [xz]_b = \sum_{i=1}^{n}([xy_{i}y_{i+1}\dotsm y_n z]_b-[xy_{i+1}\dotsm y_n z]_b) . \] | |
| Cancelling the final $\abs{y_{i+1}\dotsm y_nz}$ base-$b$ digits in the summand difference --- | |
| which are identical --- this becomes | |
| \[ [xyz]_b = [xz]_b + \sum_{i=1}^{n}b^{\lvert{y_{i+1}\dotsm y_n z}\rvert}([xy_i]_b-[x]_b) . \] | |
| But $b^{\lvert z\rvert}([xy_i]_b-[x]_b)=[xy_iz]_b-[xz]_b$ by | |
| adding and subtracting $[z]_b$, so we have | |
| \[ [xyz]_b = [xz]_b + \sum_{i=1}^{n}b^{\vert{y_{i+1}\dotsm y_n}\rvert}([xy_iz]_b-[xz]_b) . \] | |
| Since $N\mid[xz]_b$ and $N\mid[xy_iz]_b$ for each $1\leq i\leq n$, | |
| it follows that $N\mid[xyz]_b$. | |
| The other direction is clear, since $[xz]_b$ and numbers of the | |
| form $[xLz]_b$ are both of the form $[xL^*z]_b$. | |
| \end{proof} | |
| In practice, our algorithm employs this lemma with | |
| $L\coloneqq\{y_1,\dotsc,y_n\}\subseteq\Sigma_b$, and all numbers of the form | |
| $[xL^*z]_b$ are shown to be composite with the following corollary. | |
| \begin{corollary}\label{corone} | |
| If\/ $1<\gcd([xz]_b,[xy_1z]_b,\dotsc,[xy_nz]_b)<[xz]_b$ | |
| then all numbers of the form\/ $[x\{y_1,\dotsc,y_n\}^*z]_b$ are composite. | |
| \end{corollary} | |
| \begin{proof} | |
| By Lemma~\ref{lemone}, we know | |
| that $N\coloneqq\gcd([xz]_b,[xy_1z]_b,\dotsc,[xy_nz]_b)>1$ | |
| divides all numbers of the form $[x\{y_1,\dotsc,y_n\}^*z]_b$. | |
| By the size condition $N$ is strictly less than each such number, and | |
| so is a nontrivial divisor. | |
| \end{proof} | |
| \begin{example} | |
| Since $\gcd(49, 469)=7$, every number with base-$10$ representation | |
| of the form $\4\6^*\9$ is | |
| divisible by $7$. | |
| Since $1 < 7 < 49$, each such number is composite. | |
| \end{example} | |
| We also generalize this to the following corollary in the case where | |
| a single divisor does not divide each number in the family. | |
| \begin{corollary}\label{cortwo} | |
| Let\/ $L \coloneqq \lbrace y_1, y_2, \ldots, y_n \rbrace$. | |
| If \[N_0\coloneqq\gcd \paren[\big]{ \{ [xz]_b \} \union [xL^2 z]_b} \] | |
| and | |
| \[ N_1\coloneqq \gcd \paren[\big]{[xLz]_b \union [xL^3z]_b} \] | |
| lie strictly between\/ $1$ and\/ $[xz]_b$, then all numbers of the form\/ | |
| $[xL^*z]_b$ are composite. | |
| \end{corollary} | |
| \begin{proof} | |
| By Lemma~\ref{lemone} applied | |
| to $[x(L^2)^*z]_b$, we know that $N_0$ divides all numbers of the | |
| form $[x L^* z]_b$ in which an even number of $y_i$ appear. | |
| By Lemma~\ref{lemone} on $[xy_i(L^2)^*z]_b$ | |
| for each $1\leq i\leq n$, we know that $N_1$ | |
| divides all numbers of the form $[x L^* z]_b$ for which an | |
| odd number of $y_i$ appear. | |
| By the size conditions, $N_0$ and $N_1$ are nontrivial divisors. | |
| \end{proof} | |
| \begin{example} | |
| Since $\gcd([\6]_9,[\6\1\1]_9)=2$, every number with | |
| base-$9$ representation of the form $\6\1^*$ of | |
| odd length is divisible by $2$. | |
| Since $\gcd([\6\1]_9,[\6\1\1\1]_9)=5$, | |
| every number with base-$9$ representation of the form $\6\1^*$ | |
| of even length is divisible by $5$. Since these numbers lie | |
| strictly between $1$ and $6$, every number with base-$9$ | |
| representation of the form $\6\1^*$ is | |
| composite. | |
| \end{example} | |
| We also note that it is simple to generalize Corollary~\ref{cortwo} | |
| to apply to | |
| check if there are divisors $N_0$, $N_1$, $\dotsc$, $N_{k-1}$ such that | |
| $N_i$ divides all numbers of the form $[x\{y_1,\dotsc,y_n\}^*z]_b$ in which | |
| the number of $y_i$ appearing is congruent to $i\bmod k$. | |
| \begin{example} | |
| Let $b \coloneqq 16$. | |
| Then | |
| $7$ divides $[\8\A\0\1]_b$ and $[\8\A\0\A\A\A\1]_b$. | |
| Furthermore, $13$ divides | |
| $[\8\A\0\A\1]_b$ and $[\8\A\0\A\A\A\A\1]_b$, | |
| and $3$ divides $[\8\A\0\A\A\1]_b$ and | |
| $[\8\A\0\A\A\A\A\A\1]_b$. Thus all numbers with base-$16$ | |
| representation of the form $\8\A\0\A^*\1$ are | |
| divisible by either $7$, $13$, or $3$, depending on their | |
| length mod $3$. | |
| \end{example} | |
| A version of Lemma~\ref{lemone} which applies to the most general kind | |
| of family we need to consider ($x_1L_1^*\dotsm x_mL_m^*$, where we allow the case $L_m^*=\emptyset$) | |
| is formulated in Lemma~\ref{lemtwo}. | |
| \begin{lemma}\label{lemtwo} | |
| Let\/ $x_1$, $\dotsc$, $x_m\in \Sigma^*_b$, and\/ $L_1$, $\dotsc$, | |
| $L_m\subseteq\Sigma^*_b$. Then\/ | |
| $N$ divides all numbers of the form\/ $[x_1L_1^*x_2L_2^*\dotsm x_mL_m^*]_b$ | |
| if and only if\/ | |
| $N$ divides\/ $[x_1\dotsm x_m]_b$ and all numbers of the form\/ | |
| $[x_1L_1x_2x_3\dotsm x_m]_b$, $\dotsc$, $[x_1\dotsm x_{m-1}x_mL_m]_b$. | |
| \end{lemma} | |
| \begin{proof} | |
| Say $w\in x_1L_1^*x_2L_2^*\dotsm x_mL_m^*$; then there exist | |
| $y_{i,1}$, $\dotsc$, $y_{i,n_i}\in L_i$ such that | |
| \[ w = x_1y_{1,1}\dotsm y_{1,n_1}x_2y_{2,1}\dotsm y_{2,n_2}\dotsm x_m y_{m,1}\dotsm y_{m,n_m} \] | |
| for $1\leq i\leq m$. | |
| As in the proof of Lemma~\ref{lemone}, we have that | |
| \[ [w]_b = [x_1\dotsm x_m]_b + \sum_{i=1}^m\sum_{j=1}^{n_i} b^{\abs{y_{i,j+1}\dotsm y_{m,n_m}}}([x_1\dotsm x_i y_{i,j} x_{i+1}\dotsm x_m]_b-[x_1\dotsm x_m]_b) \] | |
| from which the claim follows. | |
| \end{proof} | |
| As in Lemma~\ref{lemone}, we typically apply this lemma in the case | |
| where each $L_i\subseteq\Sigma_b$ and show that all numbers of the | |
| form $[x_1L_1^*x_2L_2^*\dotsm x_mL_m^*]_b$ have a divisor. | |
| \begin{example} | |
| Take $(L_1, L_2, L_3) \coloneqq (\{ \0 \},\{\0\},\emptyset)$ and $(x_1,x_2, x_3) \coloneqq (\9, \8, \1)$. | |
| Since $9$ divides $981$, $9081$, and $9801$, it follows that $9$ | |
| divides every number with base-$10$ representation | |
| of the form $\9\0^*\8\0^*\1$. | |
| \end{example} | |
| More generally, if a single divisor doesn't work for every number, | |
| Lemma~\ref{lemtwo} can also be applied in the case where all numbers | |
| of the form $[x_1L_1^*\dotsm x_i(L_i^2)^*\dotsm x_mL_m^*]_b$ have one | |
| divisor, and all numbers of the form | |
| $[x_1L_1^*\dotsm x_iL_i(L_i^2)^*\dotsm x_mL_m^*]_b$ have another divisor. | |
| \begin{example} | |
| Let $b \coloneqq 11$. | |
| Since $3$ divides each of $[\4\4\A\1]_b$, | |
| $[\4\4\A\1\1\1]_b$, | |
| $[\4\4\0\A\1]_b$ , | |
| it follows that | |
| every number of the form $[\4\4\0^*(\1\1)^*\1]_b$ is composite. | |
| Since $2$ divides each of $[\4\4\A\1\1]_b$, | |
| $[\4\4\A\1\1\1\1]_b$, $[\4\4\0\A\1\1]_b$, we know | |
| every number of the form $[\4\4\0^*(\1\1)^*\1\1]_b$ is composite. | |
| It follows that all numbers of the form $[\4\4\0^*\A\1^*\1]_b$ are composite. | |
| \end{example} | |
| Lemma~\ref{lemtwo} can also be applied to the case when all even-length | |
| strings under consideration have one divisor, and all the odd-length | |
| strings have another divisor. One such case is, for example, | |
| if numbers of the form | |
| $[x_1(L_1^2)^*x_2(L_2^2)^*x_3]_b$ and $[x_1 L_1(L_1^2)^*x_2L_2(L_2^2)^*x_3]_b$ | |
| have one divisor, and numbers of the form $[x_1L_1(L_1^2)^*x_2(L_2^2)^*x_3]_b$ | |
| and $[x_1(L_1^2)^*x_2L_2(L_2^2)^*x_3]_b$ have another divisor. | |
| \begin{example} | |
| Let $b \coloneqq 9$. | |
| Since $2$ divides each of $[\6]_b$, | |
| $[\1\1\6]_b$, | |
| $[\6\1\1]_b$, | |
| $[\1\6\1]_b$, | |
| $[\1\1\1\6\1]_b$, | |
| $[\1\6\1\1\1]_b$, every odd-length string of the form $\1^*\6\1^*$ is | |
| composite. | |
| Since $5$ divides each of $[\1\6]_b$, | |
| $[\1\1\1\6]_b$, | |
| $[\1\6\1\1]_b$, | |
| $[\6\1]_b$, | |
| $[\1\1\6\1]_b$, | |
| $[\6\1\1\1]_b$, | |
| every even-length string of the form $\1^*\6\1^*$ is composite. | |
| \end{example} | |
| \subsection{The second strategy} | |
| A second way of proving that families of the form $xL^*z$ do not contain a | |
| prime is via algebraic factorizations, such as a difference-of-squares | |
| factorization. | |
| \begin{lemma}\label{lemsquares} | |
| Let\/ $x$, $z\in\Sigma^*_b$, \updated{$y\in\Sigma_b$}, and let\/ $g\coloneqq\gcd([y]_b,b-1)$, | |
| $X\coloneqq([y]_b+(b-1)[x]_b)/g$, and\/ $Y\coloneqq(b^{\lvert{z}\rvert}[y]_b-(b-1)[z]_b)/g$. | |
| If\/ $b$, $X$, and\/ $Y$ are all squares and\/ | |
| $\sqrt{b^{\lvert z\rvert}X}-\sqrt{Y}>(b-1)/g$, then all numbers of the form\/ | |
| $[xy^*z]_b$ are composite. | |
| \end{lemma} | |
| \begin{proof} | |
| Evaluating the base-$b$ expansion of $xy^nz$, we get | |
| \begin{align*} | |
| [xy^nz]_b &= b^{\lvert z\rvert+n}[x]_b + b^{\lvert z\rvert}\frac{b^n-1}{b-1}[y]_b + [z]_b \\ | |
| &= \frac{b^{\lvert z\rvert+n}X-Y}{(b-1)/g} . | |
| %= \frac{(\sqrt{Xb^{\lvert z\rvert+n}}+\sqrt{Y})(\sqrt{Xb^{\lvert z\rvert+n}}-\sqrt{Y})}{(b-1)/g} | |
| \end{align*} | |
| Since $b$, $X$, and $Y$ are all squares the numerator factors as a | |
| difference of squares. By the size condition both factors are | |
| strictly larger than the | |
| denominator, and so the factorization is nontrivial. | |
| \end{proof} | |
| \begin{example} | |
| Let $b\coloneqq16$, $x\coloneqq\4$, $y\coloneqq\4$, and $z\coloneqq\1$. Then $g=1$, $X=8^2$, $Y=7^2$, and | |
| \[ [\4\4^n\1]_b = \frac{(4^{n+1}\cdot8+7)(4^{n+1}\cdot8-7)}{15} . \] | |
| Since $4\cdot8-7>15$, this factorization is nontrivial and no number of | |
| the form $[\4\4^*\1]_b$ is prime. | |
| \end{example} | |
| It is also possible to combine Lemma~\ref{lemtwo} with Corollary~\ref{cortwo} | |
| to construct a test which also applies to bases which are not squares. | |
| \begin{corollary} | |
| Using the same setup as in Lemma~\ref{lemtwo}, if\/ $b^{\abs{z}}X$ and\/ $Y$ | |
| are squares, $\sqrt{b^{\lvert z\rvert}X}-\sqrt{Y}>(b-1)/g$, and\/ | |
| $1<\gcd([xyz]_b,[xy^3z]_b)<[xz]_b$, then all numbers of the form\/ $[xy^*z]_b$ | |
| are composite. | |
| \end{corollary} | |
| \begin{proof} | |
| Say $n=2m$ is even. Then from the factorization in Lemma~\ref{lemtwo}, | |
| \[ [xy^nz]_b = \frac{(b^m\sqrt{b^{\abs{z}}X}+\sqrt{Y})(b^m\sqrt{b^{\abs{z}}X}-\sqrt{Y})}{(b-1)/g} \] | |
| which is nontrivial by the size condition. | |
| Alternatively, if $n$ is odd then as in Corollary~\ref{cortwo} we have that | |
| $\gcd([xyz]_b,[xy^3z]_b)$ divides $[xy^nz]_b$, and by the size condition this | |
| divisor is nontrivial. | |
| \end{proof} | |
| \begin{example} | |
| Let $b\coloneqq17$, $x\coloneqq\1\9$, $y\coloneqq\9$, and $z\coloneqq\9$. Then $g=1$, | |
| $b^{\abs{z}}X=85^2$, $Y=3^2$, and | |
| \[ [xy^{2n}z]_b = \frac{(17^n\cdot85+3)(17^n\cdot85-3)}{16} . \] | |
| Since $85-3>16$ this factorization is nontrivial. Furthermore, all numbers | |
| of the form $[xy^{2n+1}z]_b$ are even, so all numbers of the form | |
| $[\1\9\9^*\9]_b$ are composite. | |
| \end{example} | |
| Finally, we present a variant of Lemma~\ref{lemsquares} which applies to a | |
| difference-of-cubes factorization. | |
| \begin{lemma}\label{lemcubes} | |
| Let\/ $x$, $z\in\Sigma^*_b$, \updated{$y\in\Sigma_b$}, and let\/ $g\coloneqq\gcd([y]_b,b-1)$, | |
| $X\coloneqq([y]_b+(b-1)[x]_b)/g$, and\/ $Y\coloneqq(b^{\lvert{z}\rvert}[y]_b-(b-1)[z]_b)/g$. | |
| If\/ $b$, $X$, and\/ $Y$ are all cubes and\/ | |
| $\sqrt[3]{b^{\lvert z\rvert}X}-\sqrt[3]{Y}>(b-1)/g$, then all numbers of | |
| the form\/ | |
| $[xy^*z]_b$ are composite. | |
| \end{lemma} | |
| \begin{proof} | |
| As in Lemma~\ref{lemsquares}, we have | |
| \[ [xy^nz]_b = \frac{\bigl((b^{\abs{z}+n}X)^{1/3}-Y^{1/3}\bigr)\bigl((b^{\abs{z}+n}X)^{2/3}+(b^{\abs{z}+n}XY)^{1/3}+Y^{2/3}\bigr)}{(b-1)/g} . \] | |
| The second factor is at least as large as the first (except in the | |
| single case $b^{\abs{z}+n}X=1$ and $Y=-1$, which is not possible by | |
| construction of $X$ and~$Y$), | |
| so by the size condition both factors are strictly larger than the | |
| denominator, and the factorization is nontrivial. | |
| \end{proof} | |
| \begin{example} | |
| Let $b\coloneqq8$, $x\coloneqq\1$, $y\coloneqq\0$, and $z\coloneqq\1$. Then $g=7$, $X=1$, $Y=-1$, and | |
| \[ [\1\0^n\1]_b = (2^{n+1}+1)(4^{n+1}-2^{n+1}+1) . \] | |
| Since $2-(-1)>1$, this factorization is nontrivial and no number of the | |
| form $[\1\0^*\1]_b$ is prime. | |
| \end{example} | |
| \section{Our heuristic algorithm}\label{secalg} | |
| As previously mentioned, in practice to compute $M(L_b)$ one works with an underapproximation | |
| $M$ of $M(L_b)$ and an overapproximation $L$ of $L_b-\sup(M)$. | |
| One then refines such approximations until $L=\emptyset$ from which it follows that $M=M(L_b)$. | |
| For the initial approximation, note that every minimal prime in base $b$ with at least 4 | |
| digits is of the form $x Y^* z$, where $x\in\Sigma_b-\{\0\}$, $z\in\Sigma_b$, and | |
| \[ Y \coloneqq \Sigma_b - \set{y}{\text{$(p)_b\subw xyz$ for some prime $p$}} . \] | |
| Making use of this, our algorithm sets $M$ to be the set of base-$b$ representations | |
| of the minimal primes with at most $3$ digits (which can be found simply by brute force) | |
| and $L$ to be $\bigcup_{x,z}xY^*z$, as described above. | |
| All remaining minimal primes are members of $L$, so to find them we explore the families in~$L$. | |
| During this process, each family will be decomposed into possibly multiple other families. | |
| For example, a simple way of exploring the family $xY^*z$ where $Y\coloneqq\{y_1,\dotsc,y_n\}$ is to decompose it | |
| into the families $xY^*y_1z$, $\dotsc$, $xY^*y_nz$. If the smallest member (say $xy_iz$) | |
| of any such family happens to be prime, it can be added to $M$ and the family | |
| $xY^*y_iz$ removed from consideration. Furthermore, once $M$ has been updated it may be possible | |
| to simplify some families in $L$. In this case, $xY^*y_jz$ (for $j\neq i$) can be simplified to | |
| $x(Y-\{y_i\})^*y_jz$ since no minimal prime contains $xy_iz$ as a proper subword. | |
| Another way of decomposing %families | |
| the family $xY^*z$ is possible if one knows that a digit of $Y$ can only occur a certain number | |
| of times. For example, if $xy_iy_iz$ has a proper prime subword then the digit $y_i$ | |
| can occur at most once in any minimal prime of the form $xY^*z$, and we can split $xY^*z$ into | |
| the two families | |
| \[ x(Y-\{y_i\})^*z \qquad\text{and}\qquad x(Y-\{y_i\})^*y_i(Y-\{y_i\})^*z .\] | |
| Lastly, the family $xY^*z$ can be decomposed by considering digits of $Y$ which are mutually incompatible, | |
| i.e., they cannot occur simultaneously in a minimal prime. For example, if $xy_iy_jz$ and $xy_jy_iz$ ($i\neq j$) | |
| both have proper prime subwords then the digits $y_i$ and $y_j$ cannot occur simultaneously in any minimal | |
| prime of the form $xY^*z$, and we can split $xY^*z$ into the two families | |
| \[ x(Y-\{y_i\})^*z \qquad\text{and}\qquad x(Y-\{y_j\})^*z . \] | |
| Sometimes it is not possible to show two digits are mutually incompatible, but it is possible to know that | |
| one digit must appear before the other. For example, if $xy_iy_jz$ has a proper prime subword then the digit | |
| $y_j$ must appear before $y_i$ in any minimal prime of the form $xY^*z$, and we can replace $xY^*z$ | |
| with the family | |
| \[ x(Y-\{y_i\})^*(Y-\{y_j\})^*z . \] | |
| Similarly, if $xy_jyy_iz$ has a proper prime subword then we can split $xY^*yY^*z$ into the two families | |
| \[ x(Y-\{y_i\})^*yY^*z \qquad\text{and}\qquad xY^*y(Y-\{y_j\})^*z . \] | |
| We now formulate these arguments for the most general kind of family we need to consider, namely | |
| $x_1L_1^*\dotsm x_mL_m^*$. For simplicity, we only specify the decompositions as applying to | |
| $L_1\coloneqq\{y_1,\dotsc,y_n\}$, but it is straightforward to generalize these decompositions | |
| to also apply to $L_i$ for any $1\leq i\leq m$. | |
| \begin{lemma}\label{lemexplore} | |
| Every minimal prime of the form\/ $x_1L_1^*\dotsm x_mL_m^*$ must also be of the form\/ | |
| $x_1x_2L_2^*\dotsm x_mL_m^*$ or\/ $x_1y_iL_1^*x_2L_2^*\dotsm x_mL_m^*$ for some\/ $1\leq i\leq n$. | |
| \end{lemma} | |
| \begin{proof} | |
| Follows from the fact that $L_1^*=\{\epsilon\}\cup\bigcup_{i=1}^n y_iL_1^*$. | |
| \end{proof} | |
| Similarly, one can generalize Lemma~\ref{lemexplore} to apply to adding characters to the right of $L_1$ rather than | |
| to the left. | |
| \begin{example} | |
| The family $\1\0\{\0,\1\}^*\6\1^*\1$ splits into the three families $\1\0\6\1^*\1$, $\1\0\{\0,\1\}^*\0\6\1^*\1$, and $\1\0\{\0,\1\}^*\1\6\1^*\1$ | |
| by exploring $\{\0,\1\}^*$ on the right. | |
| \end{example} | |
| \begin{lemma}\label{lemsplit1} | |
| If\/ $x_1y_i^kx_2\dotsm x_m$ contains a prime proper subword (for some\/ $k\geq1$) then every minimal prime of the form\/ | |
| $x_1L_1^*\dotsm x_mL_m^*$ is of the form \[x_1(L_1-\{y_i\})^*(y_i(L_1-\{y_i\})^*)^jx_2L_2^*\dotsm x_mL_m^*\] | |
| for some\/ $0\leq j<k$. | |
| \end{lemma} | |
| \begin{proof} | |
| If $w\in x_1L_1^*\dotsm x_mL_m^*$ then $w\in x_1yx_2L_2^*\dotsm x_mL_m^*$ for some $y\in L_1^*$. | |
| If $y$ contains $k$ or more instances of $y_i$ then by assumption it follows that $w$ contains a proper prime subword, | |
| and therefore is not a minimal prime. So if $w$ is a minimal prime then $y$ must contain less than $k$ | |
| instances of $y_i$, i.e., $y$ must be of the form $(L_1-\{y_i\})^*(y_i(L_1-\{y_i\})^*)^j$ for some $0\leq j<k$, | |
| from which the claim follows. | |
| \end{proof} | |
| \begin{example} | |
| The string $\6\6\1$ represents a prime in base 9, and is a proper subword of $\1\0\6\6\1$. It follows that | |
| the family $\1\0\{\0,\1,\6\}^*\1$ splits into the families | |
| $\1\0\{\0,\1\}^*\1$ and $\1\0\{\0,\1\}^*\6\{\0,\1\}^*\1$ in base 9. | |
| \end{example} | |
| Lemma~\ref{lemsplit1} is especially useful when it can be applied with $k=1$, since in that case the family $x_1L_1^*\dotsm x_mL_m^*$ | |
| is replaced by a single strictly simpler family, in contrast to the other lemmas we will describe. | |
| \begin{lemma}\label{lemsplit2} | |
| If\/ $x_1y_iy_jx_2\dotsm x_m$ and\/ $x_1\updated{y_jy_i}x_2\dotsm x_m$ contain prime proper subwords (where\/ $i\neq j$) then | |
| every minimal prime of the form\/ $x_1L_1^*\dotsm x_mL_m^*$ is of the form | |
| \[x_1(L_1-\{y_i\})^*x_2L_2^*\dotsm x_mL_m^* \qquad\text{or}\qquad x_1(L_1-\{y_j\})^*x_2L_2^*\dotsm x_mL_m^* . \] | |
| \end{lemma} | |
| \begin{proof} | |
| If $w\in x_1L_1^*\dotsm x_mL_m^*$ then $w\in x_1yx_2L_2^*\dotsm x_mL_m^*$ for some $y\in L_1^*$. | |
| If $y$ contains both $y_i$ and $y_j$ then by assumption it follows that $w$ contains a proper prime subword, | |
| and therefore is not a minimal prime. So if $w$ is a minimal prime then $y$ cannot contain $y_i$ and $y_j$ | |
| simultaneously, i.e., $y$ must be of the form $(L-\{y_i\})^*$ or $(L-\{y_j\})^*$, from which the claim follows. | |
| \end{proof} | |
| \begin{example} | |
| The string $\4\6\1\1$ represents a prime in base 8, and is a proper subword of $\4\4\6\4\1\1$ and $\4\4\4\6\1\1$. | |
| It follows that the family $\4\4\{\4,\6\}^*\1\1$ splits into the families | |
| $\4\4\4^*\1\1$ and $\4\4\6^*\1\1$ in base 8. | |
| \end{example} | |
| \begin{lemma}\label{lemsplit2b} | |
| If\/ $x_1y_iy_jx_2\dotsm x_m$ contains a prime proper subword (where\/ $i\neq j$) then every minimal prime of the form\/ | |
| $x_1L_1^*\dotsm x_mL_m^*$ is of the form \[x_1(L_1-\{y_i\})^*(L_1-\{y_j\})^*x_2L_2^*\dotsm x_mL_m^* . \] | |
| \end{lemma} | |
| \begin{proof} | |
| If $w\in x_1L_1^*\dotsm x_mL_m^*$ then $w\in x_1yx_2L_2^*\dotsm x_mL_m^*$ for some $y\in L_1^*$. | |
| If $y$ contains $y_i$ before $y_j$ then by assumption it follows that $w$ contains a proper prime subword, | |
| and therefore is not a minimal prime. So if $w$ is a minimal prime then $y$ cannot contain $y_i$ before~$y_j$, | |
| i.e., $y$ must be of the form $(L-\{y_i\})^*(L-\{y_j\})^*$, from which the claim follows. | |
| \end{proof} | |
| %\begin{example} | |
| %The string $\1\0$ represents a prime in base 11, and is a proper subword of $\9\1\0\1$. | |
| %It follows that the family $\9\{\0,\1\}^*\1$ splits into the family $\9\0^*\1^*\1$ in base 11. | |
| %\end{example} | |
| \begin{example} | |
| The string $\1\0$ represents a prime in base 11, and is a proper subword of $\9\0\1\0\1$. | |
| It follows that the family $\9\0\{\0,\1,\9\}^*\1$ splits into the family $\9\0\{\0,\9\}^*\{\1,\9\}^*\1$ in base 11. | |
| \end{example} | |
| \begin{lemma}\label{lemsplit2c} | |
| If\/ $x_1y_ix_2y_{2,j}x_3\dotsm x_m$ contains a prime proper subword (where\/ $y_{2,j}\in L_2$) then every minimal prime of the form\/ | |
| $x_1L_1^*\dotsm x_mL_m^*$ is of the form | |
| \[x_1(L_1-\{y_i\})^*x_2L_2^*\dotsm x_mL_m^* \qquad\text{or}\qquad x_1L_1x_2(L_2-\{y_{2,j}\})^*x_3L_3^*\dotsm x_mL_m^* . \] | |
| \end{lemma} | |
| \begin{proof} | |
| If $w\in x_1L_1^*\dotsm x_mL_m^*$ then $w\in x_1yx_2y'x_3L_3^*\dotsm x_mL_m^*$ for some $y\in L_1^*$ and $y'\in L_2^*$. | |
| If $y$ contains $y_i$ and $y'$ contains $y_{2,j}$ then by assumption it follows that $w$ contains a proper prime subword, | |
| and therefore is not a minimal prime. So if $w$ is a minimal prime then either $y$ cannot contain $y_i$ or $y'$ cannot | |
| contain $y_{2,j}$, i.e., $y$ is of the form $(L_1-\{y_i\})^*$ and $y'$ is of the form $L_2^*$, or $y$ is of the form | |
| $L_1^*$ and $y'$ is of the form $(L_2-\{y_{2,j}\})^*$, from which the claim follows. | |
| \end{proof} | |
| \begin{example} | |
| The string $\6\0\4\1\1$ represents a prime in base 8, and is a proper subword of $\6\0\4\1\0\1$. | |
| It follows that the family $\6\0\{\0,\4\}^*\1\0^*\1$ splits into the families | |
| $\6\0\0^*\1\0^*\1$ and $\6\0\{\0,\4\}^*\1\1$ in base 8. | |
| \end{example} | |
| We call families of the form $xy^*z$ (where $x$, $z\in\Sigma_b^*$ and $y\in\Sigma_b$) \emph{simple} families. | |
| Our algorithm then proceeds as follows: | |
| \begin{enumerate} | |
| \item $M\coloneqq\{\text{minimal primes in base $b$ of length $\leq3$}\}$ \\ | |
| $L\coloneqq\bigcup_{x,z\in\Sigma_b}xY^*z$, where $x\neq\0$ and $Y$ is the set of digits $y$ such that $xyz$ has no subword in $M$ | |
| \item While $L$ contains non-simple families: | |
| \begin{enumerate} | |
| \item Explore each family of $L$ by applying Lemma~\ref{lemexplore}, and update $L$. | |
| \item Examine each family of $L$: | |
| \begin{enumerate} | |
| \item Let $w$ be the shortest string in the family. If $w$ has a subword in $M$, then remove the family from $L$. | |
| If $w$ represents a prime, then add $w$ to $M$ and remove the family from $L$. | |
| \item If possible, simplify the family by applying Lemma~\ref{lemsplit1} with $k=1$. | |
| \item Using the techniques of Section~\ref{seclemmas}, check if the family can be proven to only contain | |
| composites, and if so then remove the family from $L$. | |
| \end{enumerate} | |
| \item Apply Lemmas~\ref{lemsplit1}, \ref{lemsplit2}, \ref{lemsplit2b}, and~\ref{lemsplit2c} to the families of $L$ as much | |
| as possible and update $L$; after each split examine the new families as in~(b). | |
| \end{enumerate} | |
| \end{enumerate} | |
| The process of exploring/examining/splitting a family can be concisely expressed in a tree of decompositions. | |
| A sample tree of decompositions, for the family $\1\{\0,\1,\6\}^*\1$ in base~9, is displayed in Figure~\ref{decomptree}. | |
| When applying Lemma~\ref{lemexplore} on a family with only one nonempty $L_i$ we don't show the first decomposition | |
| (simply removing $L_i^*$) since that results in the shortest word in the family, which is always implicitly | |
| checked for primality/minimal subwords at each step anyway. | |
| \subsection{Implementation} | |
| An implementation of the above algorithm was written in C using the GMP library~\cite{gmp} by the first author~\cite{code}. | |
| \updated{Independently, the second author~\cite{devillers} wrote an implementation in C++ using the MIRACL library~\cite{miracl} using the same ideas and derived the same results (with a less extensive search for primes in the simple families, but with results for bases $b\leq50$).} | |
| Note that when exploring the families in line~(a), one should not always apply Lemma~\ref{lemexplore} directly as stated by adding | |
| characters to the left of $L_1$; in our implementation we alternate between adding characters | |
| to the left and right of $L_{1+k\bmod m}$, where $k$ is the number of times we have previously passed through line~(a). | |
| Also, the application of Lemma~\ref{lemsplit2b} tended to initially | |
| produce an excess number of cases, so it was useful to only apply it following a certain number of iterations (6 in our implementation). | |
| It also led to duplicate families after simplifying, for example families of the form $xy^*y^ny^*z$ for varying $n$, | |
| but these could be detected and removed. | |
| At the conclusion of the algorithm described, $L$ will consist of simple families (of the form $xy^*z$) which have not yet | |
| yielded a prime, but for which there is no obvious reason why there can't be a prime of such a form. | |
| In such a case, the only way to proceed is to test the primality of larger and larger numbers of such form and hope | |
| a prime is eventually discovered. | |
| The numbers in simple families are of the form $(ab^n+c)/d$ for some fixed $a$, $b$, $c$, $d$ where $d\mid ab^n+c$ for all $n$. | |
| Except in the special case $c=\pm1$ and $d=1$, when $n$ is large the known primality tests for such a number are too inefficient to run. | |
| In this case one must resort to a probable primality test such as a Miller--Rabin test, unless a divisor of the number can be found. | |
| Since we are testing many numbers in an exponential sequence, it is possible to use a sieving process to find divisors | |
| rather than using trial division. | |
| To do this, we made use of Geoffrey Reynolds' | |
| {\tt srsieve} software~\cite{srsieve}. | |
| This program uses the baby-step giant-step | |
| algorithm to find all primes $p$ which divide $ab^n+c$ where $p$ and $n$ lie in a specified range. Since this program cannot | |
| handle the general case $(ab^n+c)/d$ when $d>1$ we only used it to sieve the sequence $ab^n+c$ for primes $p\nmid d$, and initialized the list of candidates to not include $n$ for which there is some prime $p\mid d$ for which $p\mid(ab^n+c)/d$. | |
| The program had to be modified slightly to remove a check which would prevent it from running in the case when $a$, $b$, and $c$ were all odd (since then $2\mid ab^n+c$\updated{, but $2$ may not divide $(ab^n+c)/d$}). | |
| Once the numbers with small divisors had been removed, it remained to test the remaining numbers using a probable primality test. | |
| For this we used the software {\tt LLR} by Jean Penn\'e~\cite{llr}. | |
| Although undocumented, it is possible to run this program on numbers of the form $(ab^n+c)/d$ when $d\neq1$, so this program required no modifications. | |
| A script was also written which allowed one to run {\tt srsieve} while | |
| {\tt LLR} was testing the remaining candidates, so that when a divisor | |
| was found by {\tt srsieve} on a number which had not yet been tested by | |
| {\tt LLR} it would be removed from the list of candidates. | |
| In the cases where the elements of $M(L_b)$ could be proven prime rigorously, | |
| we employed {\tt PRIMO} | |
| by Marcel Martin~\cite{primo}, | |
| an elliptic curve primality proving implementation. | |
| \begin{landscape}\begin{figure}\begin{tiny}\[ \begin{tikzpicture} | |
| \node{$\1\{\0,\1,\6\}^*\1$}[sibling distance=8cm] | |
| child{node{$\1\0\{\0,\1,\6\}^*\1$}[sibling distance=3.5cm] | |
| child{node{$\1\0\{\0,\1\}^*\1$}[sibling distance=1.75cm] | |
| child{node{$\1\0\{\0,\1\}^*\0\1$} | |
| child{node[label=below:{divisible by 2}]{$\1\0\0^*\0\1$}} | |
| child{node[label=below:{subword $\1\0\1\1$}]{$\1\0\0^*1\0^*\0\1$}}} | |
| child{node[label=below:{prime $\1\0\1\1$}]{$\1\0\{\0,\1\}^*\1\1$}}} | |
| child{node{$\1\0\{\0,\1\}^*\6\{\0,\1\}^*\1$}[sibling distance=1.75cm] | |
| child{node{$\1\0\{\0,\1\}^*\6\1^*\1$} | |
| child{node{$\1\0\{\0,\1\}^*\0\6\1^*\1$} | |
| child{node[label=below:{divisible by 8}]{$\1\0\0^*\0\6\1$}}} | |
| child{node[label=below:{subword $\1\0\1\1$}]{$\1\0\{\0,\1\}^*\1\6\1^*\1$}} | |
| child{node{$\1\0\6\1^*\1$} | |
| child{node[label=below:{not prime}]{$\1\0\6\1$}}}}}} | |
| child{node{$\1\1\{\0,\1,\6\}^*\1$}[sibling distance=4.5cm] | |
| child{node{$\1\1\{\0,\1,\6\}^*\1$}[sibling distance=1.75cm] | |
| child{node{$\1\1\{\0,\1\}^*\1$} | |
| child{node[label=below:{prime $\1\1\0\1$}]{$\1\1\{\0,\1\}^*\0\1$}} | |
| child{node{$\1\1\{\0,\1\}^*\1\1$} | |
| child{node[label=below:{difference of squares}]{$\1\1\1^*\1\1$}}}}} | |
| child{node{$\1\1\{\0,\1\}^*\6\{\0,\1\}^*\1$}[sibling distance=2.25cm] | |
| child{node[label=below:{subword $\1\1\0\1$}]{$\1\1\{\0,\1\}^*\0\6\{\0,\1\}^*\1$}} | |
| child{node{$\1\1\{\0,\1\}^*\1\6\{\0,\1\}^*\1$} | |
| child{node[label=below:{divisible by 2 or 5}]{$\1\1\1^*\1\6\1^*\1$}}} | |
| child{node{$\1\1\6\{\0,\1\}^*\1$} | |
| child{node[label=below:{divisible by 2 or 5}]{$\1\1\6\1^*\1$}}}}} | |
| child{node{$\1\6\{\0,\1,\6\}^*\1$}[sibling distance=5cm] | |
| child{node[label=below:{divisible by 2 or 5}]{$\1\6\1^*\1$}[sibling distance=3cm]}}; | |
| \end{tikzpicture} \]\end{tiny}\caption{Tree of decompositions for $\1\{\0,\1,\6\}^*\1$ in base $9$.}\label{decomptree}\end{figure}\end{landscape} | |
| \section{Results} | |
| A summary of the results of our algorithm is presented in | |
| Figure~\ref{resultsfig}; it was able to completely solve all bases up to 30 \updated{except for | |
| 17, 19, 21, and those between 25 and 29}. | |
| %bases 26, 28, and the odd bases larger than 16. | |
| The results in base~29 required some additional strategies, as described in Section~\ref{addstrat}. | |
| %For the bases up to~30 | |
| \begin{figure}\[\begin{array}{c@{\qquad}c@{\qquad}c@{\qquad}c@{\qquad}c} | |
| b & \lvert M(L_b)\rvert & \max\limits_{x\in M(L_b)}\lvert x\rvert & \parbox{5em}{\centering\# unsolved\\families} & \parbox{5em}{\centering searched height} \\ \hline | |
| 2 & 2 & 2 & 0 & - \\ | |
| 3 & 3 & 3 & 0 & - \\ | |
| 4 & 3 & 2 & 0 & - \\ | |
| 5 & 8 & 5 & 0 & - \\ | |
| 6 & 7 & 5 & 0 & - \\ | |
| 7 & 9 & 5 & 0 & - \\ | |
| 8 & 15 & 9 & 0 & - \\ | |
| 9 & 12 & 4 & 0 & - \\ | |
| 10 & 26 & 8 & 0 & - \\ | |
| 11 & 152 & 45 & 0 & - \\ | |
| 12 & 17 & 8 & 0 & - \\ | |
| 13\mathrlap{^*} & 228 & 32021 & 0 & - \\ | |
| 14 & 240 & 86 & 0 & - \\ | |
| 15 & 100 & 107 & 0 & - \\ | |
| 16 & 483 & 3545 & 0 & - \\ | |
| 17\mathrlap{^*} & \geq1279 & \geq111334 & 1 & \updated{967000} \\ | |
| 18 & 50 & 33 & 0 & - \\ | |
| 19\mathrlap{^*} & \geq3462 & \geq110986 & 1 & \updated{686000} \\ | |
| 20 & 651 & 449 & 0 & - \\ | |
| 21\mathrlap{^*} & \geq2600 & \geq479150 & 1 & \updated{475000} \\ | |
| 22 & 1242 & 764 & 0 & - \\ | |
| 23\mathrlap{^*} & \updated{6021} & \updated{800874} & \updated{0} & \updated{-} \\ | |
| 24 & 306 & 100 & 0 & - \\ | |
| 25\mathrlap{^*} & \geq17597 & \geq136967 & 12 & \updated{292000} \\ | |
| 26 & \geq5662 & \geq8773 & 2 & \updated{461000} \\ | |
| 27\mathrlap{^*} & \geq17210 & \geq109006 & 5 & \updated{355000} \\ | |
| 28\mathrlap{^*} & \geq5783 & \geq94538 & 1 & \updated{524000} \\ | |
| 29\mathrlap{^*} & \updated{\geq57283} & \updated{\geq174240} & \updated{14} & \updated{233000} \\ | |
| 30 & 220 & 1024 & 0 & - | |
| \end{array}\] | |
| \begin{center}$^*$Data based on results of probable primality tests.\end{center} | |
| \caption{Summary of results for the prime numbers for each base $b$.} | |
| \label{resultsfig} | |
| \end{figure} | |
| For every solved base, we give the size $|M(L_b)|$ and the ``width'' $\max_{x \in M(L_b)} |x|$ of | |
| the corresponding family. | |
| For the unsolved bases, we give a lower bound on the size and width of $M(L_b)$, | |
| along with the number of families of the form | |
| $xy^*z$ for which no prime member could be found, nor could the family be | |
| ruled out as only containing composites. Since such simple families can contain at | |
| most one minimal prime, an upper bound on $\abs{M(L_b)}$ is given by the | |
| sum of the lower bound for $\abs{M(L_b)}$ and the number of unsolved families. | |
| Additionally, we give the height to which the simple families were searched | |
| for primes; if there are any more primes in $M(L_b)$ they must have at least | |
| this many digits in base $b$. | |
| The family $\8\0^*\1$ in base~23, corresponding to the | |
| generalized Proth numbers $8\cdot23^n+1$, was already known to be prime | |
| for minimal $n=119215$ in the process | |
| of solving the generalized Sierpi\'nski conjecture in base 23~\cite{crus}. | |
| \updated{The largest probable prime we found was the number $\mathtt{9E}^{800873}$ | |
| (expressed as a base~23 string) | |
| or $(106\cdot23^{800873}-7)/11$ in standard notation. | |
| It contains 1090573 decimal digits and at the time of discovery was the tenth | |
| largest known probable prime according to Henri and Renaud Lifchitz's ranking~\cite{lifc}.} | |
| \subsection{Unsolved families} | |
| \updated{There were 37 families for which we were unable to determine if they contain a prime or not. | |
| They are explicitly described in Figure~\ref{unsolved} in both their ``base $b$'' string representation | |
| $xy^*z$ and ``algebraic'' form $(ab^n+c)/d=[xy^nz]_b$. The numbers $a$ and $c$ are given in their factorized form | |
| so as to help spot algebraic factorizations.} | |
| \begin{figure}%[H] | |
| \updated{\begin{center}\begin{tabular}{ccc} | |
| Base & Family & Algebraic form \\ \hline | |
| 17 & $\mathtt{F19^*}$ & $(5\cdot821\cdot17^n-3^2)/16$ \\ | |
| 19 & $\mathtt{EE16^*}$ & $(2^2\cdot13\cdot307\cdot19^n-1)/3$ \\ | |
| %21 & $\mathtt{CF^*0K}$ & $(3\cdot17\cdot21^{n+2}-11\cdot113)/4$ \\ | |
| 21 & $\mathtt{G0^*FK}$ & $2^4\cdot21^{n+2}+5\cdot67$ \\ | |
| %23 & $\mathtt{9E^*}$ & $(2\cdot53\cdot23^n-7)/11$ \\ | |
| 25 & $\mathtt{6MF^*9}$ & $(1381\cdot25^{n+1}-53)/8$ \\ | |
| & $\mathtt{CM1^*}$ & $(59\cdot131\cdot25^n-1)/24$ \\ | |
| & $\mathtt{EE1^*}$ & $(8737\cdot25^n-1)/24$ \\ | |
| & $\mathtt{E1^*E}$ & $(337\cdot25^{n+1}+311)/24$ \\ | |
| & $\mathtt{EFO^*}$ & $2\cdot3\cdot61\cdot25^n-1$ \\ | |
| & $\mathtt{F1^*F1}$ & $(19^2\cdot25^{n+2}+37\cdot227)/24$ \\ | |
| & $\mathtt{F0^*KO}$ & $3\cdot5\cdot25^{n+2}+2^2\cdot131$ \\ | |
| & $\mathtt{F0K^*O}$ & $(5\cdot11\cdot41\cdot25^{n+1}+19)/6$ \\ | |
| & $\mathtt{LOL^*8}$ & $(53\cdot83\cdot25^{n+1}-3\cdot37)/8$ \\ | |
| & $\mathtt{M1^*F1}$ & $(23^2\cdot25^{n+2}+37\cdot227)/24$ \\ | |
| & $\mathtt{M10^*8}$ & $19\cdot29\cdot25^{n+1}+2^3$ \\ | |
| & $\mathtt{OL^*8}$ & $(199\cdot25^{n+1}-3\cdot37)/8$ \\ | |
| 26 & $\mathtt{A^*6F}$ & $(2\cdot26^{n+2}-7\cdot71)/5$ \\ | |
| & $\mathtt{I^*GL}$ & $(2\cdot3^2\cdot26^{n+2}-11\cdot113)/25$ \\ | |
| 27 & $\mathtt{80^*9A}$ & $2^3\cdot27^{n+2}+11\cdot23$ \\ | |
| & $\mathtt{999G^*}$ & $(101\cdot877\cdot27^n-2^3)/13$ \\ | |
| & $\mathtt{CL^*E}$ & $(3^2\cdot37\cdot27^{n+1}-7\cdot29)/26$ \\ | |
| & $\mathtt{EI^*F8}$ & $(191\cdot27^{n+2}-2^3\cdot149)/13$ \\ | |
| & $\mathtt{F^*9FM}$ & $(3\cdot5\cdot27^{n+3}-113557)/26$ \\ | |
| 28 & $\mathtt{OA^*F}$ & $(2\cdot7\cdot47\cdot28^{n+1}+5^3)/27$ \\ | |
| 29 & $\mathtt{1A^*}$ & $(19\cdot29^n-5)/14$ \\ | |
| & $\mathtt{68L0^*6}$ & $7\cdot757\cdot29^{n+1}+2\cdot3$ \\ | |
| & $\mathtt{AMP^*}$ & $(8761\cdot29^n-5^2)/28$ \\ | |
| & $\mathtt{C^*FK}$ & $(3\cdot29^{n+2}+2\cdot331)/7$ \\ | |
| & $\mathtt{F^*OPF}$ & $(3\cdot5\cdot29^{n+3}+139\cdot1583)/28$ \\ | |
| & $\mathtt{FKI^*}$ & $(6379\cdot29^n-3^2)/14$ \\ | |
| & $\mathtt{F^*OP}$ & $(3\cdot5\cdot29^{n+2}+7573)/28$ \\ | |
| & $\mathtt{LP09^*}$ & $(31\cdot16607\cdot29^n-3^2)/28$ \\ | |
| & $\mathtt{OOPS^*A}$ & $2\cdot10453\cdot29^{n+1}-19$ \\ | |
| %& $\mathtt{O0^*FPL}$ & $2^3\cdot3\cdot29^{n+3}+31\cdot431$ \\ | |
| & $\mathtt{PC^*}$ & $(2\cdot89\cdot29^n-3)/7$ \\ | |
| & $\mathtt{PPPL^*O}$ & $(87103\cdot29^{n+1}+3^2)/4$ \\ | |
| & $\mathtt{Q^*GL}$ & $(13\cdot29^{n+2}-3\cdot1381)/14$ \\ | |
| & $\mathtt{Q^*LO}$ & $(13\cdot29^{n+2}-19\cdot109)/14$ \\ | |
| & $\mathtt{RM^*G}$ & $(389\cdot29^{n+1}-5\cdot19)/14$ | |
| \end{tabular}\end{center} | |
| \caption{The unsolved families listed in their ``base $b$'' and ``algebraic'' representations.}} | |
| \label{unsolved} | |
| \end{figure} | |
| \subsection{\boldmath Primes of the form $4n+1$ and $4n+3$} | |
| The heuristic algorithm described in Section~\ref{secalg} may also be easily modified to apply to recursive subsets of prime numbers, | |
| e.g., those congruent to $1\bmod4$ (alternatively, $3\bmod4$). The modifications necessary are to the initialization | |
| of $M$ in line~1, and to update the check ``$w$ represents a minimal prime'' to also check that $w$ is in the subset | |
| under consideration. \updated{When searching for minimal primes of the form $4n+1$, it was necessary to remove families only containing | |
| numbers of the form $4n+3$, and vice versa. A modified Lemma~\ref{lemone} can be used to check this, e.g., | |
| $[x(L^0\cup L)z]_b$ are all congruent to $c\bmod d$ implies that $[xL^*z]_b$ are all congruent to $c\bmod d$.} | |
| The lemmas of Section~\ref{seclemmas} can be used without modification, since a family which contains only composites | |
| of course does not contain any primes of a special form, either. For simplicity the lemmas of Section~\ref{secalg} are stated | |
| to apply when the minimal set consists of primes, but actually apply to general minimal sets and also need no modification. | |
| When run on the primes of the form $4n+1$ represented in base 10, our implementation successfully computes the minimal set | |
| consisting of 146 elements, the largest of which contains 79 digits. The | |
| \textit{Encyclopedia of Integer Sequences}~\cite{oeis} | |
| contains the elements for this minimal set in entry \seqnum{A111055}. Prior to our work, the longest 11 elements were missing | |
| from this listing. | |
| When run on the primes of the form $4n+3$ represented in base 10, our implementation successfully computes the minimal set | |
| consisting of 113 elements, the largest of which contains 19153 digits. \updated{Because of its size | |
| this number is not easily proven prime, but Fran\c{c}ois Morain successfully produced a primality certificate for it~\cite{mora}.} | |
| The second-largest number has 50 digits, so the remaining elements | |
| are easily proven prime. The \textit{Encyclopedia of Integer Sequences}~\cite{oeis} | |
| contains the elements for this minimal set in entry \seqnum{A111056}. Prior to our work, the longest 10 elements were missing | |
| from this listing. | |
| \section{Some additional strategies}\label{addstrat} | |
| The strategies discussed so far suffice to restrict the possible forms of minimal primes to a finite number of simple families in all bases $2\leq b\leq 28$. | |
| However, as $b$ increased, in addition to the calculations becoming more costly, | |
| it was found to be necessary to use increasingly complicated strategies. | |
| We now describe some additional strategies which we found sufficient to solve all non-simple families in base~29. | |
| %We now describe some additional strategies which we found sufficient to solve the problem in base~29 (up to a finite number of simple families). | |
| %our implementation finds $55{,}817$ minimal primes and $2{,}565$ unsolved families). | |
| %We now describe some additional strategies which we used in base~29 to find $57{,}252$ minimal primes | |
| %and 45 unsolved families (with no primes with less than $10{,}000$ digits). | |
| \begin{lemma} | |
| If every number of the form\/ $x_1(L_1-\{y_i\})^*x_2L_2^*\dotsm x_mL_m^*$ is composite, then | |
| every minimal prime of the form\/ $x_1L_1^*\dotsm x_mL_m^*$ must also be of the form\/ | |
| $x_1L_1^*y_iL_1^*x_2L_2^*\dotsm x_mL_m^*$. | |
| \end{lemma} | |
| \begin{proof} | |
| If $w\in x_1L_1^*\dotsm x_mL_m^*$ then $w\in x_1yx_2L_2^*\dotsm x_mL_m^*$ for some $y\in L_1^*$. If $y$ does not contain a $y_i$ | |
| then $w$ is composite by assumption. Therefore if $w$ is a prime then $y$ contains a $y_i$, i.e., $y\in L_1^*y_iL_1^*$, from which the | |
| result follows. | |
| (Note that one could improve this result via $y\in(L_1-\{y_i\})^*y_iL_1^*\cup L_1^*y_i(L_1-\{y_i\})^*$, but this was found to be unnecessary for our purposes.) | |
| \end{proof} | |
| \begin{example} | |
| The numbers represented by the family $\mathtt{F}\{\0,\9,\mathtt{F}\}^*\mathtt{F}$ in base~29 are divisible by $3$, | |
| so every minimal prime of the form $\mathtt{F}\{\0,\9,\mathtt{F},\mathtt{P}\}^*\mathtt{F}$ must also be of the form | |
| $\mathtt{F}\{\0,\9,\mathtt{F},\mathtt{P}\}^*\mathtt{P}\{\0,\9,\mathtt{F},\mathtt{P}\}^*\mathtt{F}$. | |
| \end{example} | |
| \begin{lemma} | |
| If\/ $x_1y_iy_jy_ix_2\dotsm x_m$ contains a prime proper subword (where\/ $i\neq j$) then every minimal prime of the form\/ | |
| $x_1L_1^*\dotsm x_mL_m^*$ is of the form \[x_1(L_1-\{y_i\})^*(L_1-\{y_j\})^*(L_1-\{y_i\})^*x_2L_2^*\dotsm x_mL_m^* . \] | |
| \end{lemma} | |
| \begin{proof} | |
| If $w\in x_1L_1^*\dotsm x_mL_m^*$ then $w\in x_1yx_2L_2^*\dotsm x_mL_m^*$ for some $y\in L_1^*$. | |
| If $y$ contains $y_iy_jy_i$ then by assumption it follows that $w$ contains a proper prime subword, | |
| and therefore is not a minimal prime. So if $w$ is a minimal prime then either $y$ does not contain | |
| a $y_j$, or $y$ contains a $y_j$ and all $y_i$s in $y$ either come before or after the $y_j$. | |
| In each case, $y$ is of the form $(L-\{y_i\})^*(L-\{y_j\})^*(L-\{y_i\})^*$, from which the claim follows. | |
| \end{proof} | |
| \begin{example} | |
| The string $\mathtt{QLQ}$ represents a prime in base~29, and is a proper subword of $\mathtt{LQLQL}$. | |
| It follows that the family $\mathtt{L}\{\mathtt{L},\mathtt{Q}\}^*\mathtt{L}$ splits into the family $\mathtt{L}\mathtt{L}^*\mathtt{Q}^*\mathtt{L}^*\mathtt{L}$ in base~29. | |
| \end{example} | |
| This rule may also be generalized to apply to the case when $x_1y_iy_jy_iy_jx_2\dotsm x_m$ contains a prime proper subword (where $i\neq j$). | |
| \begin{example} | |
| The string $\mathtt{LL9L9LQL}$ represents a prime in base~29. It follows that the family $\mathtt{LL\{\mathtt{9},\mathtt{L}\}^*\mathtt{Q}^*\mathtt{QL}}$ splits into the family $\mathtt{LL\mathtt{L}^*\mathtt{9}^*\mathtt{L}^*\mathtt{9}^*\mathtt{Q}^*\mathtt{QL}}$ in base~29. | |
| \end{example} | |
| In Section~\ref{subsecfirststrat} we described a number of strategies for determining if every member of a family has a divisor, | |
| but for some families divisors exist which will not be found using those tests. The following lemma can help one discover when | |
| this is the case; in particular when every member of a family is divisible by some small prime. For a language $L$ we use the | |
| notation $[L]_b\bmod N$ to denote the finite set of residues $\set{[x]_b\bmod N}{x\in L}$. | |
| \begin{lemma}\label{lemextdiv} | |
| If\/ $1<\gcd(k,N)$ for every\/ $k\in[L]_b\bmod N$ and\/ $p\notin[L]_b$ for every prime\/ $p$ which divides\/ $N$, then all numbers of the form\/ $[L]_b$ are composite. | |
| \end{lemma} | |
| \begin{proof} | |
| Let $x$ be an arbitrary member of $L$. Then $\gcd([x]_b,N)=\gcd([x]_b\bmod N,N)>1$ by assumption, so $[x]_b$ cannot be prime unless $[x]_b=\gcd([x]_b,N)$. But in that case $[x]_b$ would be a prime which divides $N$, and so $[x]_b\notin[L]_b$ by the second assumption, in contradiction to $x\in L$. | |
| \end{proof} | |
| To make use of this lemma, we need to be able to compute $[x_1L_1^*\dotsm x_mL_m^*]_b\bmod N$. To do this, we can make use of the following relations: | |
| \begin{align*} | |
| [Lx]_b \bmod N &= (b\cdot([L]_b\bmod N)+[x]_b)\bmod N \\ | |
| [L\{y_1,\dotsc,y_n\}]_b \bmod N &= \bigcup_{i=1}^n[Ly_i]_b\bmod N \\ | |
| [L\{y_1,\dotsc,y_n\}^*]_b \bmod N &= \bigcup_{i=0}^\infty[L\{y_1,\dotsc,y_n\}^i]_b\bmod N | |
| \end{align*} | |
| In the final case, the union may be taken to be finite over $i<l$ where $l$ is chosen such that $\bigcup_{i=0}^l[L\{y_1,\dotsc,y_n\}^i]_b\bmod N=\bigcup_{i=0}^{l-1}[L\{y_1,\dotsc,y_n\}^i]_b\bmod N$. | |
| Using these relations, we can compute $[x_1L_1^*\dotsm x_mL_m^*]_b\bmod N$ progressively, working left to right and starting from $[\emptyset]_b\bmod N = \{0\}$. To solve base~29 it was sufficient to use $N=2\cdot3\cdot5$. | |
| \begin{example} | |
| Let $b\coloneqq29$ and $N\coloneqq30$. %, and $L\coloneqq\mathtt{L}\1^*\6\1^*\mathtt{LK}^*\mathtt{K}$. | |
| Then $[\mathtt{L}\1^*\6\1^*\mathtt{LK}^*\mathtt{K}]_b\bmod N=\{4,5,6,14,15,16,25\}$, so if $k$ is in this set then $\gcd(k,N)\in\{2,5,6,15\}$ and $\gcd(k,N)>1$. Since $2$, $3$, $5\notin[\mathtt{L}\1^*\6\1^*\mathtt{LK}^*\mathtt{K}]_b$, by Lemma~\ref{lemextdiv} all numbers of the form $\mathtt{L}\1^*\6\1^*\mathtt{LK}^*\mathtt{K}$ are composite. | |
| \end{example} | |
| %The next result will allow to progress in the exploration of the repetitions of a language, %RD | |
| %since it will decompose a repetition in a part that is not interesting to be explored further, until no other rule may be used, and a part which may lead to reductions of the families to be exploited. | |
| %In the following, we shall denote by $strip(L)$ the string obtain from $L$ by deleting all the repetitions in it. | |
| %\begin{lemma} | |
| %Let $L=L1\{w_1,w_2,\ldots,w_n\}^*L2$ be some language of the kind we are interested in. | |
| %Let us decompose $\{w_1,w_2,\ldots,w_n\}$ in two disjoint sets: | |
| %$N$ (for no present interest to develop) and | |
| %$F$ (for to be further explored) - this is like a partition but $N$ or $F$ may be empty. | |
| %$F$ is the subset of $\{w_1,w_2,\ldots,w_n\}$ such that, for any $w_i\in F$, | |
| %for some $m$ in $M$, the longest prefix of m which is a subword of $strip(L_1)$ is smaller than | |
| %the longest prefix of $m$ which is a subword of $strip(L_1) w_i$; $N$ is the complement of $F$. | |
| %If $N\neq \emptyset$, one may explore the languages $L_1 N^* L_2$ and | |
| %$L_1 N^*w_i\{w_1,w_2,\ldots,w_n\}^* L_2$ for each $w_i\in F$. | |
| %\end{lemma} | |
| %This property is in some sense trivial; its true interest is that it is not useful to further explore $N^*$ until $M$ is enriched, or no other rule allows to simplify the handling. | |
| %The rule here presented allows to progress from left to right in the handling of repetitions, | |
| % but it is also possible to derive a similar decomposition for progressing from right to left. | |
| \section{Composite numbers} | |
| One can also consider the companion problem of determining the | |
| minimal elements for the composite numbers | |
| $\lbrace 4,6,8,9,10, 12, \ldots \rbrace $. | |
| Here, in contrast with the primes, we have | |
| \begin{theorem} | |
| The following decision problem is recursively solvable: given | |
| a base\/ $b$ and a DFA\/ $M$ accepting a language\/ $L$ of | |
| strings in a base-$b$ canonical representation, | |
| does\/ $M$ accept the base-$b$ representation of a composite | |
| number? | |
| \end{theorem} | |
| This follows immediately from | |
| \begin{theorem} | |
| Suppose\/ $L$ is a regular language, accepted by a | |
| deterministic finite automaton\/ $M$ of\/ $n$ states. Then | |
| if\/ $M$ accepts a composite number expressed in base\/ $b$, it must | |
| accept one whose base-$b$ representation has at most\/ | |
| $n(b^{2n} + 1)$ digits. | |
| \end{theorem} | |
| \begin{proof} | |
| If $L$ is finite, then the longest string accepted by $M$ has | |
| at most $n-1$ digits, and $n-1 < n(b^{2n} + 1)$. | |
| Otherwise $L$ is infinite. Then $M$ accepts a string of length | |
| $\ell$ with $n < \ell \leq 2n$ that can be pumped (as in the | |
| pumping lemma). That is, there exists $x \in L$ such that | |
| $x = uvw$ with $|uv| \leq n$. Then a classical proof of the | |
| non-regularity of the prime numbers | |
| (e.g., \cite[Example 3.2, p.\ 57]{HU79}) | |
| shows that either $x$ is the representation of a composite | |
| number (in which case $|x| \leq 2n \leq n(b^{2n} + 1)$) | |
| or $u v^p w$ is composite, where $p = [x]_b$. | |
| Our bound now follows. | |
| \end{proof} | |
| Write $S_b \coloneqq \set{(n)_b}{\text{$n \geq 4$ is composite}}$. | |
| For our purposes, the following theorem gives a better bound. | |
| \begin{theorem} | |
| Every minimal element of\/ $S_b$ is of length at most\/ $b+2$. | |
| \label{devi} | |
| \end{theorem} | |
| \begin{proof} | |
| Consider any word $w$ of $S_b$ of length $\geq b+3$. Since there are only | |
| $b$ distinct digits, some digit $d$ is repeated at least twice, | |
| so that $dd \subw w$. If $d > \1$, the number $[dd]_b$ is composite, | |
| as it is divisible by $[\1\1]_b$ but not equal to it. If $d = \0$, | |
| then some nonzero digit $c$ precedes it in $w$, so $c\0\0 \subw w$ | |
| and $[c\0\0]_b$ is divisible by $b^2$, which is composite. | |
| Finally, if no digit other than $\1$ is repeated, then | |
| $\1\1\1\1 \subw w$, and $[\1\1\1\1]_b = [\1\1]_b \cdot [\1\0\1]_b$, and hence | |
| is composite. | |
| \end{proof} | |
| Theorem~\ref{devi} turns the computation of the minimal elements for | |
| $S_b$ into a finite search. Our results are given in Figure~\ref{ttwo}. | |
| \begin{figure}\[\begin{array}{c@{\qquad}c@{\qquad}c} | |
| b & \lvert M(S_b)\rvert & \max\limits_{x\in M(S_b)}\lvert x\rvert \\ \hline | |
| 2 & 3 & 4 \\ | |
| 3 & 4 & 3 \\ | |
| 4 & 9 & 3 \\ | |
| 5 & 10 & 3 \\ | |
| 6 & 19 & 4 \\ | |
| 7 & 18 & 3\\ | |
| 8 & 26 & 3 \\ | |
| 9 & 28 & 2 \\ | |
| 10 & 32 & 3 \\ | |
| 11 & 32 & 3\\ | |
| 12 & 46 & 4 \\ | |
| 13 & 43 & 3 \\ | |
| 14 & 52 & 3 \\ | |
| 15 & 54 & 2 \\ | |
| 16 & 60 & 3 \\ | |
| 17 & 60 & 3 \\ | |
| 18 & 95 & 4 \\ | |
| 19 & 77 & 3 \\ | |
| 20 & 87 & 3 \\ | |
| 21 & 90 & 2 \\ | |
| 22 & 94 & 3 \\ | |
| 23 & 97 & 3 \\ | |
| 24 & 137 & 4 \\ | |
| 25 & 117 & 2 \\ | |
| 26 & 111 & 3 \\ | |
| 27 & 115 & 2 \\ | |
| 28 & 131 & 3 \\ | |
| 29 & 123 & 3 \\ | |
| 30 & 207 & 4 | |
| \end{array}\] | |
| \caption{Summary of results for the composite numbers for each base $b$.} | |
| \label{ttwo} | |
| \end{figure} | |
| \section{Conclusion and perspectives} | |
| We have (sometimes partially) solved the problem of finding the minimal elements for the prime numbers | |
| represented with the first bases. However, the problem rapidly becomes very complex: the number of members | |
| increases rapidly, some bases need sophisticated strategies, | |
| and some members are very long. | |
| Hence, to complete the cases where some (simple) families resisted our analysis, and to go further in the bases, | |
| we need better exploration strategies, and better primality tests for numbers expressed by a simple family in some base. | |
| Another problem left open is to delineate the asymptotic behaviour of the size and width of those sets $M(L_b)$ when $b$ increases. | |
| A rough estimation is that the size increases exponentially, but this increase is lower when the base has many different factors. | |
| To illustrate once more the difficulty of computing $M(L)$, we recall | |
| an open problem from \cite{Sh00}: | |
| \begin{problem} | |
| Let $L \coloneqq \lbrace {\tt 1, 2, 4, 8, 16, 32, 64, } \ldots \rbrace$, the | |
| base-$10$ representation of the powers of $2$. | |
| Is it the case that | |
| $$ M(L) = \lbrace \1, \2, \4, \8, \6\5\5\3\6 \rbrace ? $$ | |
| \end{problem} | |
| This would follow, for example, if we could prove that every power | |
| of $16$ greater than $65536$ contained at least one of the digits | |
| $\lbrace \1, \2, \4, \8 \rbrace$. But this seems beyond current capabilities. | |
| \section*{Acknowledgments} | |
| We are very grateful to | |
| Fran\c{c}ois Morain for proving the primality of | |
| $2 \cdot (10^{19153} - 1)/9 + 77$, which enabled us to complete the | |
| classification of the minimal elements of primes of the form $4n+3$ | |
| in base 10. | |
| \bibliographystyle{plain} | |
| \begin{thebibliography}{10} | |
| \raggedright | |
| \bibitem{crus} G. Barnes. | |
| \newblock Sierpinski conjectures and proofs, \newline | |
| \url{www.noprimeleftbehind.net/crus/Sierp-conjectures.htm}. | |
| \bibitem{code} C. Bright. | |
| \newblock MEPN implementation, | |
| \url{github.com/curtisbright/mepn}. | |
| \bibitem{Choi} S. L. G. Choi. | |
| \newblock Covering the set of integers by congruence | |
| classes of distinct moduli. | |
| \newblock \textit{Math. Comp.} {\bf 25} (1971), 885--895. | |
| \updated{\bibitem{devillers} R. Devillers. | |
| \newblock Results in searching for minimal primes wrt subword order, | |
| \url{github.com/RaymondDevillers/primes}.} | |
| \bibitem{Dick} L. E. Dickson. | |
| \newblock \textit{History of the Theory of Numbers}. | |
| \newblock Volume I: Divisibility and Primality. | |
| \newblock Chelsea Publishing Company, New York, 1952. | |
| \bibitem{gmp} T. Granlund et al. | |
| \newblock The GNU multiple precision arithmetic library, | |
| \url{gmplib.org}. | |
| \bibitem{GHK07} H. Gruber, M. Holzer, and M. Kutrib. | |
| \newblock The size of Higman-Haines sets. | |
| \newblock \textit{Theoret. Comput. Sci.} {\bf 387} (2007), 167--176. | |
| \bibitem{GHK09} H. Gruber, M. Holzer, and M. Kutrib. | |
| \newblock More on the size of Higman-Haines sets: effective constructions. | |
| \newblock \textit{Fundam. Inform.} {\bf 91} (2009), 105--121. | |
| \bibitem{Ha69} L. H. Haines. | |
| \newblock On free monoids partially ordered by embedding. | |
| \newblock \textit{J. Combinatorial Theory} {\bf 6} (1969), 94--98. | |
| \bibitem{Hi52} G. Higman. | |
| \newblock Ordering by divisibility in abstract algebras. | |
| \newblock \textit{Proc. London Math. Soc.} (3) {\bf 2} (1952), 326--336. | |
| \bibitem{HU79} J. E. Hopcroft and J. D. Ullman. | |
| \newblock \textit{Introduction to Automata Theory, Languages, and Computation}, | |
| Addison-Wesley, 1979. | |
| \updated{\bibitem{lifc} H. Lifchitz and R. Lifchitz. | |
| \newblock PRP records, | |
| \url{www.primenumbers.net/prptop/prptop.php}} | |
| \bibitem{primo} M. Martin. | |
| \newblock Primo for Linux, | |
| \url{www.ellipsa.eu/public/primo/primo.html}. | |
| \updated{\bibitem{mora} F. Morain. | |
| \newblock Some primes proven by my programs, \newline | |
| \url{www.lix.polytechnique.fr/Labo/Francois.Morain/Primes/myprimes.html}.} | |
| \bibitem{oeis} OEIS Foundation Inc. | |
| \newblock The On-Line Encyclopedia of Integer Sequences, | |
| \url{oeis.org}. | |
| \bibitem{llr} J. Penn\'e. | |
| \newblock LLR version 3.8.15, | |
| \url{jpenne.free.fr/index2.html}. | |
| \bibitem{srsieve} G. Reynolds. | |
| \newblock Sierpinski/Riesel sieve version 0.6.17, \newline | |
| \url{sites.google.com/site/geoffreywalterreynolds/programs/srsieve}. | |
| \updated{\bibitem{miracl} M. Scott. | |
| \newblock Multiprecision integer and rational arithmetic | |
| cryptographic library, | |
| \url{www.certivox.com/miracl}.} | |
| \bibitem{Sh00} J. Shallit. | |
| \newblock Minimal primes. | |
| \newblock \textit{J. Recreational Math.} {\bf 30} (2) (2001), 113--117. | |
| \bibitem{WD} H. C. Williams and H. Dubner. | |
| \newblock The primality of $R_{1031}$. | |
| \newblock \textit{Math. Comput.} {\bf 47} (1986), 703--711. | |
| \end{thebibliography} | |
| \end{document} |