In [17]:
# test text, objectives, artifacts.
!pip install python-dotenv PyMuPDF openai pydantic






In [7]:
# import stuff
from __future__ import annotations
import argparse, hashlib, json, os, sys
from datetime import datetime, UTC
from typing import List

from dotenv import load_dotenv
from openai import OpenAI
from pydantic import BaseModel, Field, ValidationError



load_dotenv()
client = OpenAI()
OPENAI_MODEL = os.getenv("OPENAI_MODEL", "gpt-4o")

gpt-4o


In [74]:
# artifact structure
# ── LLM function spec (updated) ─────────────────────────────────────────--
ART_FUNC = {
    "type": "function",
    "function": {
        "name": "artifact_generator",
        "description": (
            "Generate artifacts for the given objective: machine, material, software, "
            "theory, workflow, principle, etc. Include each artifact's name, description, "
            "inputs and outputs, and the laws that the artifact operates under."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "artifacts": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "required": [
                            "name", "description", "inputs", "outputs", "laws"
                        ],
                        "properties": {
                            "name":  {"type": "string"},
                            "description": {"type": "string"},
                            "inputs": {"type": "string"},
                            "outputs": {"type": "string"},
                            "laws": {"type":"string"}
                        }
                    }
                }
            },
            "required": ["artifacts"]
        }
    }
}
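
The schema above describes the shape of the tool-call arguments, but nothing in the notebook checks that a decoded payload actually follows it. A minimal sketch of such a check, using only the standard library, might look like this. `validate_artifacts` and `REQUIRED_FIELDS` are hypothetical names, not part of the notebook.

```python
# Hypothetical helper (not part of the notebook): checks that a decoded
# tool-call payload has the shape that ART_FUNC asks for.
REQUIRED_FIELDS = ("name", "description", "inputs", "outputs", "laws")


def validate_artifacts(payload):
    """Return the artifact list, raising ValueError on a malformed payload."""
    artifacts = payload.get("artifacts") if isinstance(payload, dict) else None
    if not isinstance(artifacts, list):
        raise ValueError("payload must contain an 'artifacts' list")
    for i, art in enumerate(artifacts):
        if not isinstance(art, dict):
            raise ValueError(f"artifact {i} is not an object")
        # Every required field must be present and must be a string.
        missing = [f for f in REQUIRED_FIELDS if not isinstance(art.get(f), str)]
        if missing:
            raise ValueError(f"artifact {i} is missing string fields: {missing}")
    return artifacts


good = {"artifacts": [{"name": "PCR", "description": "Amplifies DNA.",
                       "inputs": "Template DNA, primers",
                       "outputs": "Amplified DNA",
                       "laws": "DNA replication"}]}
print(len(validate_artifacts(good)))
```

Raising on the first malformed artifact keeps the sketch short; collecting all problems before raising would be an equally reasonable choice.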

In [70]:
SYS_PROMPT = (
    "You are a design engineer. Using ONLY the text and the objective, "
    "propose up to {k} concrete artifacts. Each artifact must include:\n"
    "• tool_anchor (machine, material, software, theory, workflow, principle, etc.)\n"
    "• a description of the artifact\n"
    "• the inputs and outputs of the artifact\n"
    "• the laws which the artifact operates under.\n"
    #"• how this artifact achieves the objective.\n"
    # "Return JSON via function schema." 
    )

In [58]:
# obj8 = "Build a list of artifacts that engineer {breathing in water} / {breathing} / {water}."
obj_relation = "Build a list of artifacts that are related to engineering electricity and magnetism."
obj_obj = "I want to engineer electrical fields and magnetic forces."
obj_flat = "What artifacts can be built from the information in this text?"
obj_target = "I want to engineer biological systems."
obj_existing = "What are the artifacts that exist/are in use based on the following text?"

In [19]:
import fitz  # PyMuPDF

library = "/Users/b/fantasiagenesis/crayon/library"

# List all files in the library (sorted for consistent ordering)
books = sorted(os.listdir(library))

# Select the second book (index 1 because indexing starts at 0)
book_path = os.path.join(library, books[1])

print(book_path)

# Extract the text of every page of the PDF
with fitz.open(book_path) as doc:
    book_text = "".join(page.get_text() for page in doc)


print(len(book_text))

/Users/b/fantasiagenesis/crayon/library/Gene-Cloning-and-DNA-Analysis.pdf
822048


In [24]:
book_section = book_text[0:5000]
#print(book_section)

src = {}
src["title"] = "Gene Cloning and DNA Analysis"
src["text"] = book_section

In [71]:
obj = obj_existing
sys_msg = SYS_PROMPT.format(k=3)
print(sys_msg)
src = get_source(book_text, "Gene Cloning and DNA Analysis")
# Single quotes inside the f-string keep this valid on Python versions before 3.12
user_msg = f"OBJECTIVE:\n{obj}\n\nTITLE:\n{src['title']}\nTEXT:\n{src['text']}"
print(user_msg)

You are a design engineer. Using ONLY the text and the objective, propose up to 3 concrete artifacts. Each artifact must include:
• tool_anchor (machine, material, software, theory, workflow, principle, etc.)
• a description of the artifact
• the inputs and outputs of the artifact
• the laws which the artifact operates under.

OBJECTIVE:
What are the artifacts that exist/are in use based on the following text?

TITLE:
Gene Cloning and DNA Analysis
TEXT:
 Comparison between a typical glycosylation
structure found on a...
Figure 14.18 Three promoters frequently used in expression
factors for micro...
Figure 14.19 Crystalline inclusion bodies in the nuclei of
insect cells infe...
Figure 14.20 Transfer of the nucleus from a transgenic
somatic cell to an oo...
Figure 14.21 Recombinant protein production in the milk of a
transgenic shee...
Chapter 15
Figure 15.1 The structure of the insulin molecule and a
summary of its synth...
Figure 15.2 The synthesis of recombinant insulin from
artificia

In [79]:
resp = client.chat.completions.create(
        model=OPENAI_MODEL,
        messages=[{"role":"system","content":sys_msg}, {"role":"user","content":user_msg}],
        tools=[ART_FUNC], tool_choice="auto", temperature=0.5
)


In [80]:
import json
args_raw = resp.choices[0].message.tool_calls[0].function.arguments
raws = json.loads(args_raw)["artifacts"]

print(raws)

[{'name': 'Recombinant DNA Molecule', 'description': 'A recombinant DNA molecule is created by inserting a fragment of DNA, which contains a gene of interest, into a circular DNA molecule called a vector. This recombinant molecule is then introduced into a host cell, where it can replicate and produce multiple copies of the gene.', 'inputs': 'A fragment of DNA containing the gene of interest and a vector molecule.', 'outputs': 'A recombinant DNA molecule that can replicate within a host cell, producing multiple copies of the gene.', 'laws': 'Operates under the principles of molecular biology and genetic engineering, specifically the laws governing DNA replication and gene expression within host cells.'}, {'name': 'Polymerase Chain Reaction (PCR)', 'description': 'PCR is a technique used to amplify a specific segment of DNA, generating millions of copies of a particular DNA sequence. It involves repeated cycles of denaturation, annealing, and extension, carried out in a thermal cycler.'

In [125]:
def print_art(art, i=1):
    print(f"{i}. {art['name']}")
    print(f"   Description: {art['description']}")
    print(f"   Inputs:      {art['inputs']}")
    print(f"   Outputs:     {art['outputs']}")
    print(f"   Laws:        {art['laws']}")
    print()
    

def parse_func_resp(resp):
    args_raw = resp.choices[0].message.tool_calls[0].function.arguments
    raws = json.loads(args_raw)["artifacts"]

    arts = []
    for i, art in enumerate(raws, 1):
        print_art(art, i)
        arts.append(art)
    
    return arts
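
With `tool_choice="auto"` the model may answer in plain text, in which case `tool_calls` is `None` and indexing it fails. The sketch below shows one way to guard against that, and to exercise the parsing path without calling the API. `extract_artifacts` and `fake_response` are hypothetical helpers, and the stand-in object only mimics the attribute path used in this notebook.

```python
import json
from types import SimpleNamespace


def extract_artifacts(resp):
    """Return the 'artifacts' list from the first tool call, or [] if unavailable."""
    message = resp.choices[0].message
    tool_calls = getattr(message, "tool_calls", None)
    if not tool_calls:
        return []  # the model replied with text instead of calling the tool
    try:
        payload = json.loads(tool_calls[0].function.arguments)
    except json.JSONDecodeError:
        return []  # arguments were not valid JSON
    return payload.get("artifacts", []) if isinstance(payload, dict) else []


def fake_response(arguments=None):
    """Build a stand-in with the same attribute path as a chat completion."""
    calls = None
    if arguments is not None:
        calls = [SimpleNamespace(function=SimpleNamespace(arguments=arguments))]
    message = SimpleNamespace(tool_calls=calls, content="plain text reply")
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])


with_call = fake_response(json.dumps({"artifacts": [{"name": "PCR"}]}))
print(extract_artifacts(with_call))
print(extract_artifacts(fake_response()))
```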

In [66]:
# be free of objectives. be free of money. just play. 
import random
def get_source(text, title, n=20000):
    # n is the length of the sampled section, in characters
    max_index = len(text) - n

    # Ensure we don't go out of bounds
    if max_index <= 0:
        book_section = text  # fallback: use the whole text
    else:
        index = random.randint(0, max_index)
        book_section = text[index : index + n]

    src = {}
    src["title"] = title
    src["text"] = book_section

    return src
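
Because the excerpt is drawn at random, a run cannot be repeated on the same text. A small variation, sketched here under the assumption that reproducibility matters, takes a seeded `random.Random` and also returns the start offset so that an excerpt can be traced back to the book. `sample_window` is a hypothetical name.

```python
import random


def sample_window(text, n=20000, rng=None):
    """Return (start, excerpt): a random n-character window and where it begins."""
    rng = rng or random
    if len(text) <= n:
        return 0, text  # fallback: the whole text is shorter than one window
    # randrange excludes its upper bound, so the last valid start is len(text) - n
    start = rng.randrange(len(text) - n + 1)
    return start, text[start:start + n]


rng = random.Random(42)  # fixed seed so the same excerpt can be drawn again
text = "ACGT" * 100
start, excerpt = sample_window(text, n=50, rng=rng)
print(start, len(excerpt))
```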



In [91]:
artifacts = []
for x in range(0, 10):
    src = get_source(book_text, "Gene Cloning and DNA Analysis")

    sys_msg = SYS_PROMPT.format(k=3)
    user_msg = f"OBJECTIVE:\n{obj}\n\nTITLE:\n{src['title']}\nTEXT:\n{src['text']}"
    
    resp = client.chat.completions.create(
        model=OPENAI_MODEL,
        messages=[{"role":"system","content":sys_msg}, {"role":"user","content":user_msg}],
        tools=[ART_FUNC], tool_choice="auto", temperature=0.5
    )

    arts = parse_func_resp(resp)
    artifacts.extend(arts)
    

    

1. Next Generation Sequencing Machines
   Description: Modern machines capable of sequencing both ends of a DNA fragment to obtain paired end reads, crucial for genome assembly without relying on chain termination sequencing.
   Inputs:      DNA fragments of various lengths (e.g., 150 bp, 500 bp, 2 kb, 5 kb, 10 kb).
   Outputs:     Paired end reads for genome assembly.
   Laws:        Operates under the principles of DNA sequencing and molecular biology, utilizing technologies that allow for high-throughput sequencing and data analysis.

2. Northern Hybridization
   Description: A technique used to detect and measure RNA transcripts, indicating gene expression levels in various tissues or conditions.
   Inputs:      RNA extracts, labeled probes, agarose gel, denaturing electrophoresis buffer.
   Outputs:     Autoradiograph showing bands where transcripts are present, indicating expression levels.
   Laws:        Based on the principles of nucleic acid hybridization and electrophoresis,
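
Each iteration draws an independent excerpt, so the same artifact can be proposed more than once across the ten calls. A minimal sketch of removing repeats by name follows. `dedupe_artifacts` is a hypothetical helper, and matching on a normalized name is an assumption about what counts as a duplicate.

```python
def dedupe_artifacts(artifacts):
    """Keep the first artifact for each name, ignoring case and extra spaces."""
    seen = set()
    unique = []
    for art in artifacts:
        key = " ".join(art["name"].lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(art)
    return unique


sample = [
    {"name": "Spin Column Chromatography System"},
    {"name": "Ancient DNA Analysis"},
    {"name": "spin column  chromatography system"},
]
print([a["name"] for a in dedupe_artifacts(sample)])
```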

In [95]:
ri = random.randrange(len(artifacts))  # randint(0, len(artifacts)) can return an out-of-range index
artifact = artifacts[0]
print_art(artifact)

1. Next Generation Sequencing Machines
   Description: Modern machines capable of sequencing both ends of a DNA fragment to obtain paired end reads, crucial for genome assembly without relying on chain termination sequencing.
   Inputs:      DNA fragments of various lengths (e.g., 150 bp, 500 bp, 2 kb, 5 kb, 10 kb).
   Outputs:     Paired end reads for genome assembly.
   Laws:        Operates under the principles of DNA sequencing and molecular biology, utilizing technologies that allow for high-throughput sequencing and data analysis.



In [102]:
k = 3
SYS_PROMPT_OBJECTIVE = (
    "You are a design engineer. You will be provided with an artifact describing a machine, "
    "material, theory, workflow or principle. Your role is to execute the given command "
    "based on the provided artifact to the best of your ability."
)
sys_msg_objective = SYS_PROMPT_OBJECTIVE #.format(k=3)
obj_stories = f"Describe stories/usages of the following artifact, from beginning to end. Describe {k} usages."
user_msg_objective = f"OBJECTIVE:\n{obj}\n\nARTIFACT:\n{artifact['name']}\nDESCRIPTION:\n{artifact['description']}\nINPUTS:\n{artifact['inputs']}\nOUTPUTS:\n{artifact['outputs']}\nLAWS:\n{artifact['laws']}"

    

Hello


In [122]:
# what are all the dimensions of an artifact? 
# example stories/usages
obj_stories = "Describe stories/usages of the following artifact, from beginning to end. Describe {k} usages."
# workflows
obj_workflows = "Describe workflows utilizing the following artifact. Describe {k} workflows."
# architecture # how to design and build
obj_architecture = "Describe the architecture of the following artifact. How is this system designed/built?"
# design space
obj_design = "What is the design space of the following artifact? What dimensions can be explored in the use of the following artifact?"
# limitations / constraints
obj_limitations = "What are the limitations/constraints of the following artifact? Describe {k} limitations."
# best working artifacts
obj_complement = "Describe artifacts that work in complement with the following artifact. Describe {k} artifacts."
# emerging directions
obj_emerging = "Describe the emerging directions in the development/evolution/usage of the following artifact. Describe {k} emerging directions."
# competition / opponents
obj_competition = "Describe the artifacts that compete with the following artifact. Describe {k} competing artifacts."
# artifacts deviate radically from the norm
obj_deviate = "Describe usages of the following artifact that deviate radically from the norm. Describe {k} usages."
# natural phenomenon involved
obj_natural = "Describe the natural phenomenon underlying the operation of the following artifact."
# decision-making
obj_decision = "Describe the decision-making behind the operation/execution of the following artifact."
# domain relationship
obj_domain = "Describe the domain relationship behind the operation/execution of the following artifact."
# actions
obj_actions = "Describe the actions that can be carried out by the following artifact."




# mutations
obj_mutations = "Describe methods of "
# problems solved
# problem space
# evolution
# trajectory of innovation
# artifacts deviate radically from the norm
# relationship of this artifact with another. how to change this relationship. 
# foundational theory of the following artifact
# dead artifacts
# decision-making
# natural phenomenon involved
# systems composed of this artifact
# monetization
# functional space
# existing business ventures
# actions
# software
# artifacts in storytelling, education, government, money
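
The objective templates above are separate variables, and only some of them contain a `{k}` placeholder. One way to select them by name is to keep them in a dictionary; the sketch below copies two of the templates from this cell. `OBJECTIVES` and `build_objective` are hypothetical names. `str.format` ignores unused keyword arguments, so the same call works for templates with and without `{k}`.

```python
# Hypothetical registry; both templates are copied from the cell above.
OBJECTIVES = {
    "limitations": "What are the limitations/constraints of the following artifact? Describe {k} limitations.",
    "actions": "Describe the actions that can be carried out by the following artifact.",
}


def build_objective(name, k=3):
    """Look up a template by name and fill in {k} where the template uses it."""
    return OBJECTIVES[name].format(k=k)


print(build_objective("limitations", k=2))
print(build_objective("actions"))
```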

In [123]:
# ri = random.randint(0, len(artifacts))
artifact = artifacts[ri]
print_art(artifact, ri)

obj = obj_deviate.format(k=3)

sys_msg_objective = SYS_PROMPT_OBJECTIVE
user_msg_objective = f"OBJECTIVE:\n{obj}\n\nARTIFACT:\n{artifact['name']}\nDESCRIPTION:\n{artifact['description']}\nINPUTS:\n{artifact['inputs']}\nOUTPUTS:\n{artifact['outputs']}\nLAWS:\n{artifact['laws']}"


resp = client.chat.completions.create(
        model=OPENAI_MODEL,
        messages=[{"role":"system","content":sys_msg_objective}, {"role":"user","content":user_msg_objective}],
        #tools=[ART_FUNC], tool_choice="auto", 
        temperature=0.5
    )

print(resp.choices[0].message.content)



18. Oligonucleotide Directed Mutagenesis Kit
   Description: A laboratory kit designed to facilitate the process of oligonucleotide-directed mutagenesis, allowing researchers to introduce specific mutations into a gene of interest using synthetic oligonucleotides.
   Inputs:      Single-stranded DNA template, synthetic oligonucleotide with desired mutation, DNA polymerase, competent E. coli cells.
   Outputs:     Mutated double-stranded DNA, mutated recombinant protein.
   Laws:        Operates under the principles of DNA replication and annealing, governed by the laws of molecular biology and chemistry.

Certainly! Here are three unconventional usages of an Oligonucleotide Directed Mutagenesis Kit that deviate from its standard application in gene mutation studies:

1. **Synthetic Art Creation:**
   - **Usage:** The kit could be used as a medium for creating synthetic art at the molecular level. By introducing deliberate and artistic mutations into a gene sequence, researchers could d

In [139]:
# random.sample returns two distinct, in-range indices
# (randint(0, len(artifacts)) can return len(artifacts), which is out of range)
ri1, ri2 = random.sample(range(len(artifacts)), 2)

artifact1 = artifacts[ri1]
artifact2 = artifacts[ri2]

print_art(artifact1, ri1)
print_art(artifact2, ri2)

17. Spin Column Chromatography System
   Description: A material-based system used for accelerating ion exchange chromatography through centrifugation, aiding in the purification of nucleic acids or proteins.
   Inputs:      Cell lysates containing nucleic acids or proteins.
   Outputs:     Purified nucleic acids or proteins ready for further analysis.
   Laws:        Based on the principles of centrifugation and ion exchange chromatography.

10. Ancient DNA Analysis
   Description: A technique to extract and analyze DNA from archaeological or fossil specimens to study historical populations and migrations.
   Inputs:      Ancient DNA samples, sequencing reagents, bioinformatics tools.
   Outputs:     Genetic information about ancient human populations, evolutionary insights.
   Laws:        Follows the principles of genetics and molecular biology, including the Hardy-Weinberg equilibrium in population genetics.



In [140]:
def artifact_as_string(artifact):
    return f"ARTIFACT:\n{artifact['name']}\n\nDESCRIPTION:\n{artifact['description']}\n\nINPUTS:\n{artifact['inputs']}\n\nOUTPUTS:\n{artifact['outputs']}\n\nLAWS:\n{artifact['laws']}"

In [141]:
SYS_PROMPT_FUSE = (
    "You are a design engineer. You will be provided with two different artifacts, each "
    "describing a machine, material, theory, workflow or principle. Your role is to explore "
    "the relationship between these artifacts, based on the given objective, to the best of "
    "your ability. OBJECTIVE: {obj}"
)


In [149]:
for x in range(0, 50):
    # random.sample returns two distinct, in-range indices
    ri1, ri2 = random.sample(range(len(artifacts)), 2)
    
    artifact1 = artifacts[ri1]
    artifact2 = artifacts[ri2]

    obj = "Describe the relationship between the following artifacts. Describe the strength of the relationship."
    sys_msg_objective = SYS_PROMPT_FUSE.format(obj=obj)  # {obj} is a named placeholder
    user_msg_objective = f"OBJECTIVE:\n{obj}\n\nARTIFACT 1:\n{artifact_as_string(artifact1)}\nARTIFACT 2:\n{artifact_as_string(artifact2)}\n"
    resp = client.chat.completions.create(
        model=OPENAI_MODEL,
        messages=[{"role":"system","content":sys_msg_objective}, {"role":"user","content":user_msg_objective}],
        #tools=[ART_FUNC], tool_choice="auto", 
        temperature=0.5
    )

    print(f"ARTIFACT {ri1}:\n{artifact_as_string(artifact1)}\nARTIFACT {ri2}:\n{artifact_as_string(artifact2)}\n\n{resp.choices[0].message.content}")


ARTIFACT 18:
ARTIFACT:
Spin Column Chromatography System

DESCRIPTION:
A material-based system used for accelerating ion exchange chromatography through centrifugation, aiding in the purification of nucleic acids or proteins.

INPUTS:
Cell lysates containing nucleic acids or proteins.

OUTPUTS:
Purified nucleic acids or proteins ready for further analysis.

LAWS:
Based on the principles of centrifugation and ion exchange chromatography.
ARTIFACT 19:
ARTIFACT:
Ancient DNA Analysis

DESCRIPTION:
A technique to extract and analyze DNA from archaeological or fossil specimens to study historical populations and migrations.

INPUTS:
Ancient DNA samples, sequencing reagents, bioinformatics tools.

OUTPUTS:
Genetic information about ancient human populations, evolutionary insights.

LAWS:
Follows the principles of genetics and molecular biology, including the Hardy-Weinberg equilibrium in population genetics.

The relationship between the Spin Column Chromatography System and Ancient DNA Analy

ARTIFACT 24:
ARTIFACT:
Spin Column Chromatography System

DESCRIPTION:
A material-based system used for accelerating ion exchange chromatography through centrifugation, aiding in the purification of nucleic acids or proteins.

INPUTS:
Cell lysates containing nucleic acids or proteins.

OUTPUTS:
Purified nucleic acids or proteins ready for further analysis.

LAWS:
Based on the principles of centrifugation and ion exchange chromatography.
ARTIFACT 21:
ARTIFACT:
Ancient DNA Analysis

DESCRIPTION:
A technique to extract and analyze DNA from archaeological or fossil specimens to study historical populations and migrations.

INPUTS:
Ancient DNA samples, sequencing reagents, bioinformatics tools.

OUTPUTS:
Genetic information about ancient human populations, evolutionary insights.

LAWS:
Follows the principles of genetics and molecular biology, including the Hardy-Weinberg equilibrium in population genetics.

The relationship between the Spin Column Chromatography System and Ancient DNA Analy
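
The two outputs above compare the same two artifacts, which suggests that independent random draws can repeat a pair. If each pair should be examined at most once, one option is to enumerate the unordered pairs up front and shuffle them. This is a sketch, not the notebook's method; `pair_indices` is a hypothetical helper.

```python
import itertools
import random


def pair_indices(n, limit=None, seed=None):
    """Return up to `limit` distinct unordered index pairs drawn from range(n)."""
    pairs = list(itertools.combinations(range(n), 2))
    random.Random(seed).shuffle(pairs)  # seeded so the order can be reproduced
    return pairs if limit is None else pairs[:limit]


pairs = pair_indices(5, limit=4, seed=0)
print(pairs)
```

For a list of `n` artifacts this enumerates `n * (n - 1) / 2` pairs, which stays small for the few dozen artifacts collected here but grows quadratically.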

In [147]:
obj_relation1 = "Describe the relationship between the following artifacts. Describe the strength of the relationship."
obj_relation2 = "Describe a workflow using the following two artifacts."


The relationship between the Spin Column Chromatography System and Ancient DNA Analysis is primarily based on their roles in the purification and analysis of nucleic acids, respectively. Here's a breakdown of their relationship:

1. **Purpose and Functionality**:
   - **Spin Column Chromatography System**: This system is designed to purify nucleic acids (DNA/RNA) or proteins from complex mixtures such as cell lysates. It leverages centrifugation and ion exchange chromatography to efficiently separate and purify these biomolecules.
   - **Ancient DNA Analysis**: This technique involves extracting and analyzing DNA from ancient specimens to gain insights into historical populations and evolution. The analysis relies heavily on obtaining high-quality, uncontaminated DNA samples for accurate sequencing and interpretation.

2. **Interdependence**:
   - The Spin Column Chromatography System can play a crucial role in Ancient DNA Analysis by providing a method to purify ancient DNA samples. G