Conversation

@SomeoneSerge (Collaborator) commented on Dec 24, 2023:

CC @jpetrucciani from ggml-org#1724, could you confirm this still works?
CC @reckenrode for general review on darwin things.

This might either go directly into ggml-org#4605 or land as a follow-up PR.
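
For anyone reproducing the check before confirming, a minimal sketch, assuming the nix branch exposes the main binary as the default flake app (the attribute name and exact flags are assumptions; -n 2000 and the instruct-style reverse prompt are inferred from the log below):

    # Run the Metal-enabled build directly from the branch under review.
    # Model path matches the one in the log below; adjust as needed.
    nix run github:philiptaron/llama.cpp/nix -- \
        -m ~/Downloads/daringmaid-13b.Q5_K_S.gguf --instruct -n 2000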

@SomeoneSerge changed the base branch from master to nix on December 24, 2023 at 19:37.

@philiptaron (Owner) left a comment:


Log start
main: build = 0 (unknown)
main: built with clang version 16.0.6 for arm64-apple-darwin
main: seed  = 1703459265
llama_model_loader: loaded meta data with 22 key-value pairs and 363 tensors from Downloads/daringmaid-13b.Q5_K_S.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = LLaMA v2
llama_model_loader: - kv   2:                       llama.context_length u32              = 4096
llama_model_loader: - kv   3:                     llama.embedding_length u32              = 5120
llama_model_loader: - kv   4:                          llama.block_count u32              = 40
llama_model_loader: - kv   5:                  llama.feed_forward_length u32              = 13824
llama_model_loader: - kv   6:                 llama.rope.dimension_count u32              = 128
llama_model_loader: - kv   7:                 llama.attention.head_count u32              = 40
llama_model_loader: - kv   8:              llama.attention.head_count_kv u32              = 40
llama_model_loader: - kv   9:     llama.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  10:                       llama.rope.freq_base f32              = 10000.000000
llama_model_loader: - kv  11:                          general.file_type u32              = 16
llama_model_loader: - kv  12:                       tokenizer.ggml.model str              = llama
llama_model_loader: - kv  13:                      tokenizer.ggml.tokens arr[str,32000]   = ["<unk>", "<s>", "</s>", "<0x00>", "<...
llama_model_loader: - kv  14:                      tokenizer.ggml.scores arr[f32,32000]   = [0.000000, 0.000000, 0.000000, 0.0000...
llama_model_loader: - kv  15:                  tokenizer.ggml.token_type arr[i32,32000]   = [2, 3, 3, 6, 6, 6, 6, 6, 6, 6, 6, 6, ...
llama_model_loader: - kv  16:                      tokenizer.ggml.merges arr[str,61249]   = ["▁ t", "e r", "i n", "▁ a", "e n...
llama_model_loader: - kv  17:                tokenizer.ggml.bos_token_id u32              = 1
llama_model_loader: - kv  18:                tokenizer.ggml.eos_token_id u32              = 2
llama_model_loader: - kv  19:            tokenizer.ggml.unknown_token_id u32              = 0
llama_model_loader: - kv  20:            tokenizer.ggml.padding_token_id u32              = 2
llama_model_loader: - kv  21:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:   81 tensors
llama_model_loader: - type q5_K:  281 tensors
llama_model_loader: - type q6_K:    1 tensors
llm_load_vocab: special tokens definition check successful ( 259/32000 ).
llm_load_print_meta: format           = GGUF V3 (latest)
llm_load_print_meta: arch             = llama
llm_load_print_meta: vocab type       = SPM
llm_load_print_meta: n_vocab          = 32000
llm_load_print_meta: n_merges         = 0
llm_load_print_meta: n_ctx_train      = 4096
llm_load_print_meta: n_embd           = 5120
llm_load_print_meta: n_head           = 40
llm_load_print_meta: n_head_kv        = 40
llm_load_print_meta: n_layer          = 40
llm_load_print_meta: n_rot            = 128
llm_load_print_meta: n_gqa            = 1
llm_load_print_meta: f_norm_eps       = 0.0e+00
llm_load_print_meta: f_norm_rms_eps   = 1.0e-05
llm_load_print_meta: f_clamp_kqv      = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: n_ff             = 13824
llm_load_print_meta: n_expert         = 0
llm_load_print_meta: n_expert_used    = 0
llm_load_print_meta: rope scaling     = linear
llm_load_print_meta: freq_base_train  = 10000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_yarn_orig_ctx  = 4096
llm_load_print_meta: rope_finetuned   = unknown
llm_load_print_meta: model type       = 13B
llm_load_print_meta: model ftype      = Q5_K - Small
llm_load_print_meta: model params     = 13.02 B
llm_load_print_meta: model size       = 8.36 GiB (5.51 BPW)
llm_load_print_meta: general.name     = LLaMA v2
llm_load_print_meta: BOS token        = 1 '<s>'
llm_load_print_meta: EOS token        = 2 '</s>'
llm_load_print_meta: UNK token        = 0 '<unk>'
llm_load_print_meta: PAD token        = 2 '</s>'
llm_load_print_meta: LF token         = 13 '<0x0A>'
llm_load_tensors: ggml ctx size       =    0.14 MiB
ggml_backend_metal_buffer_from_ptr: allocated buffer, size =  8557.58 MiB, ( 8557.64 / 16384.02)
llm_load_tensors: system memory used  = 8556.07 MiB
...................................................................................................
llama_new_context_with_model: n_ctx      = 512
llama_new_context_with_model: freq_base  = 10000.0
llama_new_context_with_model: freq_scale = 1
ggml_metal_init: allocating
ggml_metal_init: found device: Apple M2
ggml_metal_init: picking default device: Apple M2
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: GGML_METAL_PATH_RESOURCES = nil
ggml_metal_init: loading '/nix/store/y2nb3cmrkr82kb6jc96ki82k4z6r8mqq-llama.cpp/bin/ggml-metal.metal'
ggml_metal_init: GPU name:   Apple M2
ggml_metal_init: GPU family: MTLGPUFamilyApple8 (1008)
ggml_metal_init: hasUnifiedMemory              = true
ggml_metal_init: recommendedMaxWorkingSetSize  = 17179.89 MB
ggml_metal_init: maxTransferRate               = built-in GPU
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =   400.00 MiB, ( 8959.20 / 16384.02)
llama_new_context_with_model: KV self size  =  400.00 MiB, K (f16):  200.00 MiB, V (f16):  200.00 MiB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =     0.02 MiB, ( 8959.22 / 16384.02)
llama_build_graph: non-view tensors processed: 844/844
llama_new_context_with_model: compute buffer total size = 78.19 MiB
ggml_backend_metal_buffer_type_alloc_buffer: allocated buffer, size =    75.02 MiB, ( 9034.22 / 16384.02)

system_info: n_threads = 4 / 8 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 |
main: interactive mode on.
Reverse prompt: '

### Instruction:

'
sampling:
	repeat_last_n = 64, repeat_penalty = 1.100, frequency_penalty = 0.000, presence_penalty = 0.000
	top_k = 40, tfs_z = 1.000, top_p = 0.950, min_p = 0.050, typical_p = 1.000, temp = 0.800
	mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampling order:
CFG -> Penalties -> top_k -> tfs_z -> typical_p -> top_p -> min_p -> temp
generate: n_ctx = 512, n_batch = 512, n_predict = 2000, n_keep = 99


== Running in interactive mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to LLaMa.
 - To return control without starting a new line, end your input with '/'.
 - If you want to submit another line, end your input with '\'.

 Below is an instruction that describes a task. Write a response that appropriately completes the request.

Instruction:

Herbert Hoover was the American president that oversaw the beginning of the Great Depression.
Describe his early career.
Pay close attention to business ventures and geography.
Present a year-by-year summary from 1895 through 1920.

Response:

Herbert Hoover, born in Iowa in 1874, was an American president who oversaw the beginning of the Great Depression. His early career was marked by notable business ventures and geographical transitions. The following is a year-by-year summary from 1895 to 1920:

1895: At age 21, Herbert Hoover graduated from Stanford University with a degree in geology. He began his career as a mining engineer in the western United States, working primarily in California and Nevada. His first job was with the famous British-owned mining company, Bewcastle Mining Company.

1896: Hoover moved to Australia to work for another British mining firm, Canton Bar Mine and Railway Company. He spent two years there before returning to the United States in 1898.

1899: Upon his return, Hoover accepted a position with the American-based mining company, Northern Mining Company, based in North Carolina. This was his first exposure to working for an American company.

1900: Hoover moved again, this time to China, where he worked as a consultant for various Chinese and British mining companies. He spent five years in China, learning about different cultures and business practices.

1905: After China, Herbert Hoover moved to London, England, where he founded his own firm, Hoover & Company, specializing in mining engineering and consulting services for international clients. This marked his entry into the world of international business.

1906: He returned briefly to the United States to marry Lou Henry, an accomplished geologist herself. They honeymooned in Alaska, exploring its natural wonders and potential mineral resources.

1907: The couple moved back to London, where Hoover continued running his successful company while also serving as a director of various other companies across Europe and the United States.

1912: Herbert Hoover was appointed chairman of the Federal Trade Commission (FTC) by President William Howard Taft, marking his first foray into government service. He held this position until 1920.

1914: With the onset of World War I, Hoover took on a new role as head of the American Relief Administration (ARA), an organization that provided humanitarian aid to millions of people affected by the war in Europe. His leadership in this capacity earned him widespread recognition and praise.

1917: President Woodrow Wilson appointed Hoover as the United States Food Administrator, tasked with overseeing food production and distribution during World War I. He introduced initiatives like rationing and price controls to conserve resources and ensure fairness in access to essential goods.

1920: After serving his term at the FTC and various other roles during wartime, Herbert Hoover returned to private business ventures. He founded the Hoover Institute of Public Policy at Stanford University, demonstrating his continued interest in public service and policy issues.
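
A note on one line in the log above: "default.metallib not found, loading from source" means the Metal backend found no precompiled metallib and instead compiled the shaders at startup from the ggml-metal.metal file in the Nix store path shown. The backend also consults the GGML_METAL_PATH_RESOURCES environment variable (printed as nil above) as an override for the directory in which it searches for that file. A sketch of using it, with a hypothetical path:

    # Point the Metal backend at a directory containing ggml-metal.metal
    # (the path below is illustrative, not taken from this PR).
    export GGML_METAL_PATH_RESOURCES=/path/to/share/llama.cpp
    ./main -m daringmaid-13b.Q5_K_S.gguf -i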

@philiptaron merged commit fea0239 into philiptaron:nix on Dec 24, 2023.