In [2]:
import torch

In [3]:
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "/srv/climagent/deepseek_model"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float32,
    device_map="cpu",
)

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

: 

In [17]:
prompt = """You are a sophisticated scientist trained in scientific research and innovation.

Given the following key concepts extracted from a comprehensive knowledge graph, your task is to synthesize a novel research hypothesis. Your response should not only demonstrate deep understanding and rational thinking but also explore imaginative and unconventional applications of these concepts.

Consider this list of nodes and relationships from a knowledge graph between Silk and Energy Intensive.   
 The format of the graph is "node_1 -- relationship between node_1 and node_2 -- node_2 -- relationship between node_2 and node_3 -- node_3...."

Here is the graph:

silk –> provides –> biocompatibility –> possess –> biological materials –> has –> multifunctionality –> include –>self-cleaning –> include –> multifunctionality –> broad applicability in biomaterial design –> silk –> possess –> biopolymers –> possess –> silk –> is –> fibroin –> is –> silk –> broad applicability in biomaterial design –> multifunctionality –> include –> structural coloration –> exhibited by –> insects –> are –> energy-intensive

Here is an analysis of the concepts and relationships in the graph:\\n\\nDefinitions of Terms in the Knowledge Graph:

- Silk: A natural protein fiber produced by silkworms and spiders, known for its mechanical strength, flexibility, and biodegradability. In scientific applications, silk refers to silk fibroin, a protein-based biomaterial.
- Biocompatibility: The ability of a material to perform with an appropriate host response when applied in a medical context. It means the material does not provoke an immune response and integrates well with biological tissues.
- Biological Materials: Substances that are produced by or interact with living organisms. This includes both natural and engineered materials used in biomedical applications, such as proteins, cells, and biopolymers.
- Multifunctionality: The capacity of a material or system to perform multiple functions simultaneously. In biomaterials, this includes properties like mechanical strength, biodegradability, antimicrobial activity, and environmental responsiveness.
- Self-cleaning: A property of materials that enables them to repel or degrade contaminants without external assistance. Often linked with hydrophobic surfaces or photocatalytic activity.
- Broad Applicability in Biomaterial Design: The versatility of a material, such as silk, to be used across a range of applications, including tissue engineering, drug delivery, and sensors due to its adaptable properties.
- Biopolymers: Naturally occurring polymers produced by living organisms. These include polysaccharides, proteins, and nucleic acids. Biopolymers are typically biodegradable and biocompatible.
- Fibroin: The structural protein that forms the core of silk fibers, known for its crystalline β-sheet structure, which provides strength and stability to silk materials.
- Structural Coloration: A color-producing mechanism caused by micro- or nanostructures that interfere with light rather than by pigments. Found in butterfly wings and beetle shells.
- Insects: A class of arthropods characterized by three-part bodies, exoskeletons, and six legs. Many exhibit structural coloration due to microscopic surface structures.
- Energy-Intensive: Processes or organisms that require high amounts of energy for their biological functions. Insects with complex optical nanostructures for structural coloration often require significant metabolic energy for development.

---

Discussion of Relationships in the Knowledge Graph:

- Silk → provides → biocompatibility: Silk exhibits high biocompatibility, making it suitable for medical and tissue engineering applications. This is due to its natural origin, minimal immune response, and ability to degrade harmlessly in the body.
- Biocompatibility → possess → biological materials: Biocompatibility is a defining property of many biological materials used in medicine. These materials interact safely and effectively with biological systems.
- Biological materials → has → multifunctionality: Many biological materials are multifunctional, enabling them to perform various roles, such as structural support, signaling, or interaction with cells.
- Multifunctionality → include → self-cleaning: One functional attribute that biological materials can exhibit is self-cleaning, which enhances longevity and reduces maintenance, especially in biomaterials exposed to biological environments.
- Self-cleaning → include → multifunctionality: This recursive structure suggests self-cleaning is not only a product of multifunctionality but also a contributor to a broader multifunctional profile in materials.
- Multifunctionality → broad applicability in biomaterial design → silk: The multifunctionality of silk underpins its wide-ranging use in biomaterial design. This includes uses in wound dressings, scaffolds, and optical devices.
- Silk → possess → biopolymers: Silk itself is composed of biopolymers—specifically, proteins such as fibroin. These polymers contribute to its mechanical and biological properties.
- Biopolymers → possess → silk: Silk, as a biopolymer-based material, exemplifies the properties and potential of biopolymers in engineering contexts.
- Silk → is → fibroin: Fibroin is the core protein component of silk fibers. It defines the structural and functional attributes of silk, especially its mechanical strength and biodegradability.
- Silk → broad applicability in biomaterial design → multifunctionality: Again reinforcing that the multifunctional nature of silk enables its integration into diverse biomedical and engineering applications.
- Multifunctionality → include → structural coloration: Among its multifunctional features, silk can exhibit or be engineered to display structural coloration, relevant in optics and responsive materials.
- Structural coloration → exhibited by → insects: Structural coloration is naturally present in insects, such as butterflies and beetles, due to nanostructures on their surfaces.
- Insects → are → energy-intensive: The biological production of structural coloration in insects is metabolically demanding, reflecting a significant energy investment in these micro/nanostructures.\\n\\n

Analyze the graph deeply and carefully, then craft a detailed research hypothesis that investigates a likely groundbreaking aspect that incorporates EACH of these concepts. Consider the implications of your hypothesis and predict the outcome or behavior that might result from this line of investigation. Your creativity in linking these concepts to address unsolved problems or propose new, unexplored areas of study, emergent or unexpected behaviors, will be highly valued.

Be as quantitative as possible and include details such as numbers, sequences, or chemical formulas. Please structure your response in JSON format, with SEVEN keys:

"hypothesis" clearly delineates the hypothesis at the basis for the proposed research question.

Ensure your scientific hypothesis is both innovative and grounded in logical reasoning, capable of advancing our understanding or application of the concepts provided.

Here is an example structure for your response, in JSON format:

{{ "hypothesis": "..."}}

Remember, the value of your response is as scientific discovery, new avenues of scientific inquiry and potential technological breakthroughs, with details and solid reasoning.

Make sure to incorporate EACH of the concepts in the knowledge graph in your response."""

In [10]:
prompt = "who are you ?"

In [20]:
input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
print(f"Prompt length: {input_ids.shape[-1]} tokens")

Prompt length: 1559 tokens


In [None]:
from transformers import pipeline
nlp_pipeline = pipeline("text-generation", model=model, tokenizer=tokenizer)
response = nlp_pipeline(prompt, max_new_tokens=64, num_return_sequences=1, do_sample=True, temperature=0.5)

Device set to use cpu


In [22]:
print(response[0]['generated_text'])

You are a sophisticated scientist trained in scientific research and innovation.

Given the following key concepts extracted from a comprehensive knowledge graph, your task is to synthesize a novel research hypothesis. Your response should not only demonstrate deep understanding and rational thinking but also explore imaginative and unconventional applications of these concepts.

Consider this list of nodes and relationships from a knowledge graph between Silk and Energy Intensive.   
 The format of the graph is "node_1 -- relationship between node_1 and node_2 -- node_2 -- relationship between node_2 and node_3 -- node_3...."

Here is the graph:

silk –> provides –> biocompatibility –> possess –> biological materials –> has –> multifunctionality –> include –>self-cleaning –> include –> multifunctionality –> broad applicability in biomaterial design –> silk –> possess –> biopolymers –> possess –> silk –> is –> fibroin –> is –> silk –> broad applicability in biomaterial design –> multif