
initial release

1 parent 25d5e70 commit eabe79160117019e0c385be2dee36c99c64e1cc4 @jwatte jwatte committed Sep 24, 2012
Showing with 30,726 additions and 1 deletion.
  1. +116 −0 Makefile
  2. +163 −1 README.md
  3. +16 −0 TESTING.md
  4. +7 −0 build.sh
  5. +124 −0 configure
  6. +380 −0 daemon/AdminServer.cpp
  7. +37 −0 daemon/AdminServer.h
  8. +125 −0 daemon/AllKeys.cpp
  9. +63 −0 daemon/AllKeys.h
  10. +143 −0 daemon/Argument.cpp
  11. +109 −0 daemon/Argument.h
  12. +77 −0 daemon/Debug.cpp
  13. +30 −0 daemon/Debug.h
  14. +447 −0 daemon/EagerConnection.cpp
  15. +126 −0 daemon/EagerConnection.h
  16. +110 −0 daemon/EagerConnectionFactory.cpp
  17. +30 −0 daemon/EagerConnectionFactory.h
  18. +20 −0 daemon/FakeEagerConnection.h
  19. +43 −0 daemon/FakeHttpRequest.h
  20. +99 −0 daemon/FakeStatStore.cpp
  21. +31 −0 daemon/FakeStatStore.h
  22. +376 −0 daemon/HttpServer.cpp
  23. +155 −0 daemon/HttpServer.h
  24. +48 −0 daemon/IComplete.h
  25. +34 −0 daemon/IStatCounter.h
  26. +15 −0 daemon/IStatCounterFactory.h
  27. +72 −0 daemon/IStatStore.h
  28. +21 −0 daemon/IStatWriter.h
  29. +8 −0 daemon/Logs.cpp
  30. +9 −0 daemon/Logs.h
  31. +187 −0 daemon/LoopbackCounter.cpp
  32. +80 −0 daemon/LoopbackCounter.h
  33. +18 −0 daemon/MetaInfo.h
  34. +68 −0 daemon/PduProtocol.cpp
  35. +137 −0 daemon/PduProtocol.h
  36. +102 −0 daemon/ReplicaOf.cpp
  37. +43 −0 daemon/ReplicaOf.h
  38. +114 −0 daemon/ReplicaPdus.cpp
  39. +162 −0 daemon/ReplicaPdus.h
  40. +158 −0 daemon/ReplicaServer.cpp
  41. +41 −0 daemon/ReplicaServer.h
  42. +884 −0 daemon/RequestInFlight.cpp
  43. +69 −0 daemon/RequestInFlight.h
  44. +91 −0 daemon/Retention.cpp
  45. +42 −0 daemon/Retention.h
  46. +548 −0 daemon/Settings.cpp
  47. +63 −0 daemon/Settings.h
  48. +172 −0 daemon/ShardedMap.h
  49. +663 −0 daemon/StatCounter.cpp
  50. +68 −0 daemon/StatCounter.h
  51. +21 −0 daemon/StatCounterFactory.cpp
  52. +26 −0 daemon/StatCounterFactory.h
  53. +724 −0 daemon/StatServer.cpp
  54. +174 −0 daemon/StatServer.h
  55. +424 −0 daemon/StatStore.cpp
  56. +98 −0 daemon/StatStore.h
  57. +34 −0 daemon/Timing.cpp
  58. +22 −0 daemon/Timing.h
  59. +1,015 −0 daemon/main.cpp
  60. +122 −0 daemon/test_main.inl
  61. +11 −0 daemon/threadfunc.h
  62. +929 −0 debian/changelog
  63. +1 −0 debian/compat
  64. +12 −0 debian/control
  65. +36 −0 debian/copyright
  66. +1 −0 debian/docs
  67. +6 −0 debian/istatd.dirs
  68. +19 −0 debian/postinst
  69. +7 −0 debian/postrm
  70. +47 −0 debian/preinst
  71. +24 −0 debian/rules
  72. +87 −0 files/agents.html
  73. BIN files/autorefresh-active.png
  74. BIN files/autorefresh.png
  75. BIN files/closebox-hilite.png
  76. BIN files/closebox-press.png
  77. BIN files/closebox.png
  78. BIN files/dashboard-icon.png
  79. BIN files/dropdown-arrow.png
  80. +1 −0 files/dygraph-combined.js
  81. BIN files/favicon.ico
  82. BIN files/filled-check.png
  83. +47 −0 files/form.html
  84. BIN files/graph-close.png
  85. BIN files/graph-remove-counter.png
  86. BIN files/graph-zoomout.png
  87. +2,050 −0 files/graph.js
  88. +128 −0 files/index.html
  89. +4 −0 files/jquery.min.js
  90. BIN files/refresh.png
  91. BIN files/resize-grabby.png
  92. BIN files/search-icon.png
  93. +177 −0 files/sha256.js
  94. BIN files/slider-knob.png
  95. BIN files/slider-marker.png
  96. +974 −0 files/styles.css
  97. BIN files/tree-leaf.png
  98. BIN files/tree-minus.png
  99. BIN files/tree-plus.png
  100. BIN files/unfilled-check.png
  101. +9 −0 ftest/agent.cfg
  102. +50 −0 ...ets_on_interval_boundry_when_limiting_number_of_samples_and_start_time_not_a_multiple_of_interval
  103. +38 −0 ftest/expected/test_json_interface/GET_does_not_return_buckets_where_there_are_gaps
  104. +23 −0 ...n_interface/GET_does_not_return_buckets_where_there_are_gaps_and_when_samples_are_limited_by_user
  105. +9 −0 ...pected/test_json_interface/GET_increases_size_of_bucket_when_max_samples_is_limited_by_the_caller
  106. +94 −0 ftest/expected/test_json_interface/GET_returns_15_minutes_only_when_end_is_specified
  107. +94 −0 ftest/expected/test_json_interface/GET_returns_15_minutes_only_when_start_is_specified
  108. +5 −0 ...xpected/test_json_interface/GET_returns_1_bucket_when_diff_of_end_and_start_is_less_than_interval
  109. +4 −0 ...test_json_interface/GET_returns_empty_data_when_start_and_end_are_before_than_oldest_data_in_file
  110. +4 −0 .../GET_returns_empty_data_when_start_and_end_are_more_than_an_hour_later_than_the_last_data_in_file
  111. +2 −0 ftest/expected/test_json_interface/GET_returns_error_when_end_is_less_than_start
  112. +2 −0 ftest/expected/test_json_interface/GET_returns_error_when_end_is_negative
  113. +2 −0 ftest/expected/test_json_interface/GET_returns_error_when_start_is_negative
  114. +14 −0 ...ace/GET_returns_finest_resolution_when_max_samples_is_greater_than_number_of_samples_in_the_range
  115. +92 −0 ...ected/test_json_interface/GET_returns_last_15_minutes_of_data_when_start_and_end_is_not_specified
  116. +26 −0 ...pected/test_json_interface/GET_returns_normalized_start_and_stop_when_data_missing_from_both_ends
  117. +1 −0 ftest/expected/test_json_interface/POST_requires_keys_array
  118. +43 −0 ftest/expected/test_json_interface/POST_returns_a_value_when_one_counter_specified
  119. +57 −0 ftest/expected/test_json_interface/POST_returns_a_value_when_two_counters_are_specified
  120. +2 −0 ftest/expected/test_json_interface/POST_returns_error_when_end_before_start
  121. +2 −0 ftest/expected/test_json_interface/POST_returns_error_when_end_negative
  122. +2 −0 ftest/expected/test_json_interface/POST_returns_error_when_start_negative
  123. +57 −0 ftest/expected/test_json_interface/POST_returns_normalized_start_and_stop_times
  124. +50 −0 ftest/expected/test_listen_address/restricted_listen_address_on_correct_ip
  125. +2 −0 ftest/expected/test_listen_address/restricted_listen_address_on_incorrect_ip
  126. +34 −0 ftest/expected/test_reduction/reduction_test_10s
  127. +11 −0 ftest/expected/test_reduction/reduction_test_5m
  128. +334 −0 ftest/functions
  129. +10 −0 ftest/master.cfg
  130. +8 −0 ftest/restrictedlisten.cfg
  131. +8 −0 ftest/single.cfg
  132. +7 −0 ftest/slave.cfg
  133. +9 −0 ftest/store_n_forward.cfg
  134. +40 −0 ftest/test_agent_to_master.sh
  135. +36 −0 ftest/test_counter_incrementing.sh
  136. +43 −0 ftest/test_counter_list.sh
  137. +26 −0 ftest/test_eager_connection_leak.sh
  138. +13 −0 ftest/test_freeze_time.sh
  139. +50 −0 ftest/test_import_old_data.sh
  140. +103 −0 ftest/test_json_interface.sh
  141. +18 −0 ftest/test_list_matching.sh
  142. +27 −0 ftest/test_listen_address.sh
  143. +38 −0 ftest/test_nums2file.sh
  144. +29 −0 ftest/test_reduction.sh
  145. +22 −0 ftest/test_replication.sh
  146. +34 −0 ftest/test_store_n_forward_to_master.sh
  147. +17 −0 include/istat/Atomic.h
  148. +42 −0 include/istat/Bucket.h
  149. +42 −0 include/istat/Env.h
  150. +73 −0 include/istat/Header.h
  151. +36 −0 include/istat/IRecorder.h
  152. +128 −0 include/istat/Log.h
  153. +34 −0 include/istat/Mmap.h
  154. +117 −0 include/istat/StatFile.h
  155. +20 −0 include/istat/istattime.h
  156. +59 −0 include/istat/strfunc.h
  157. +95 −0 include/istat/test.h
  158. +249 −0 include/json/json-forwards.h
  159. +1,855 −0 include/json/json.h
  160. +111 −0 istatd-init.sh
  161. +17 −0 istatd.cfg
  162. +5 −0 istatd.default
  163. +13 −0 istatd.settings
  164. +38 −0 lib/Atomic.cpp
  165. +145 −0 lib/Bucket.cpp
  166. +33 −0 lib/Env.cpp
  167. +9 −0 lib/Header.cpp
  168. +188 −0 lib/Log.cpp
  169. +74 −0 lib/LogFormatterLog.cpp
  170. +80 −0 lib/LogInstanceFile.cpp
  171. +33 −0 lib/LogInstanceFile.h
  172. +206 −0 lib/Mmap.cpp
  173. +32 −0 lib/RecordStat.cpp
  174. +688 −0 lib/StatFile.cpp
  175. +50 −0 lib/istattime.cpp
  176. +4,196 −0 lib/jsoncpp.cpp
  177. +758 −0 lib/strfunc.cpp
  178. +222 −0 lib/test.cpp
  179. +51 −0 make.def
  180. +1 −0 settings/cit.set
  181. +3 −0 settings/users.set
  182. +27 −0 splitter/README.txt
  183. +288 −0 splitter/main.cpp
  184. +16 −0 test.cfg
  185. +2 −0 test/spawnload.sh
  186. +43 −0 test/test_Bucket.cpp
  187. +29 −0 test/test_Debug.cpp
  188. +65 −0 test/test_Env.cpp
  189. +93 −0 test/test_Log.cpp
  190. +64 −0 test/test_LoopbackCounter.cpp
  191. +80 −0 test/test_PduProtocol.cpp
  192. +104 −0 test/test_PduReaderActor.cpp
  193. +20 −0 test/test_RecordStat.cpp
  194. +152 −0 test/test_Settings.cpp
  195. +716 −0 test/test_StatCounter.cpp
  196. +55 −0 test/test_StatCounterFactory.cpp
  197. +371 −0 test/test_StatFile.cpp
  198. +149 −0 test/test_StatServer.cpp
  199. +144 −0 test/test_StatStore.cpp
  200. +28 −0 test/test_Timing.cpp
  201. +23 −0 test/test_boost.cpp
  202. +97 −0 test/test_istattime.cpp
  203. +427 −0 test/test_strfunc.cpp
  204. +11 −0 tool/deploy-ui.sh
  205. +37 −0 tool/exampleClientProgram.py
  206. +354 −0 tool/istatd_filedump.cpp
  207. +159 −0 tool/istatd_filegen.cpp
  208. +84 −0 tool/istatd_fileinfo.cpp
  209. +141 −0 tool/istatd_flush.cpp
  210. +93 −0 tool/istatd_import.cpp
  211. +146 −0 tool/istatd_lint.cpp
  212. +340 −0 tool/istatd_loadtest.cpp
  213. +90 −0 tool/istatd_nums2file.cpp
  214. +150 −0 tool/istatd_purge.cpp
  215. +132 −0 tool/istatd_query.py
  216. +45 −0 tool/istatd_sleep.cpp
  217. +162 −0 tool/istatd_stat.cpp
  218. +104 −0 tool/istatd_tcp_memleak.pl
  219. +89 −0 tool/random_numbers.cpp
  220. +120 −0 tool/rrd2istatd.py
  221. +207 −0 tool/rrdimport.py
116 Makefile
@@ -0,0 +1,116 @@
+-include makevars.config
+TEST_SRCS:=$(wildcard test/*.cpp)
+TEST_OBJS:=$(patsubst %.cpp,obj/%.o,$(TEST_SRCS))
+TESTS:=$(patsubst test/%.cpp,%,$(TEST_SRCS))
+TOOL_SRCS:=$(wildcard tool/*.cpp)
+TOOL_OBJS:=$(patsubst %.cpp,obj/%.o,$(TOOL_SRCS))
+TOOLS:=$(patsubst tool/%.cpp,%,$(TOOL_SRCS))
+LIB_SRCS:=$(wildcard lib/*.cpp)
+LIB_OBJS:=$(patsubst %.cpp,obj/%.o,$(LIB_SRCS))
+LIBS:=istatdaemon istat
+DIR_DEPS:=obj obj/lib obj/daemon obj/test obj/tool obj/splitter bin
+FTEST_FILES:=$(wildcard ftest/test_*.sh)
+DAEMON_MAIN_SRC:=daemon/main.cpp
+DAEMON_MAIN_OBJ:=$(patsubst %.cpp,obj/%.o,$(DAEMON_MAIN_SRC))
+DAEMON_SRCS:=$(filter-out $(DAEMON_MAIN_SRC),$(wildcard daemon/*.cpp))
+DAEMON_OBJS:=$(patsubst %.cpp,obj/%.o,$(DAEMON_SRCS))
+SPLITTER_MAIN_SRC:=splitter/main.cpp
+SPLITTER_MAIN_OBJ:=$(patsubst %.cpp,obj/%.o,$(SPLITTER_MAIN_SRC))
+DAEMONS:=istatd splitd
+DEPS:=$(sort $(patsubst %.o,%.d,$(TEST_OBJS) $(TOOL_OBJS) $(LIB_OBJS) $(DAEMON_OBJS) $(DAEMON_MAIN_OBJ) $(SPLITTER_MAIN_OBJ)))
+BINS:=$(patsubst %,bin/%,$(TESTS) $(TOOLS) $(DAEMONS))
+LIB_DEPS:=$(foreach lib,$(LIBS),obj/lib$(lib).a)
+HDRS:=$(wildcard include/istat/*.h) $(wildcard include/json/*.h)
+TESTS_TO_RUN:=
+FILES_SRCS:=$(wildcard files/*)
+SETTINGS_SRCS:=$(wildcard settings/*)
+
+DESTDIR?=/
+USR_PREFIX?=$(DESTDIR)/usr
+VAR_PREFIX?=$(DESTDIR)/var
+ETC_PREFIX?=$(DESTDIR)/etc
+INSTALL?=install -C -D
+TOUCH=touch
+
+INSTALL_DIRS:=
+INSTALL_DSTS:=
+CXX:=g++#./gstlfilt/gfilt
+LXXFLAGS:=-Lobj/ $(patsubst %,-l%,$(LIBS))
+ifeq ($(OPT),)
+OPT := -O2
+endif
+CXXFLAGS:=-pipe $(OPT) -g -Iinclude -MMD -D_LARGEFILE64_SOURCE -Wall -Werror
+SYS_LIBS:=$(BOOST_SYSTEM) -lboost_thread -lboost_signals -lpthread $(STATGRAB) $(BOOST_FILESYSTEM) -lboost_date_time
+
+all: $(DIR_DEPS) $(LIB_DEPS) $(BINS) tests ftests
+
+dpkg:
+ env DEB_BUILD_OPTIONS="nostrip" debuild -us -uc
+ @echo done
+
+build: $(DIR_DEPS) $(BINS)
+ @echo "build done"
+
+clean:
+ rm -fr obj bin testdata/* /tmp/test /var/tmp/test /tmp/ss.test
+
+distclean: clean
+ rm -f makevars.config
+
+killall:
+ killall -q istatd || true
+
+-include make.def
+
+obj/libistat.a: $(LIB_OBJS)
+ ar cr $@ $^
+$(eval $(call add_install,obj/libistat.a,$(USR_PREFIX)/lib/libistat.a,664))
+HEADERS:=$(wildcard include/istat/*)
+$(foreach hfile,$(HEADERS),$(eval $(call add_install,$(hfile),$(USR_PREFIX)/$(hfile),664)))
+
+obj/libistatdaemon.a: $(DAEMON_OBJS)
+ ar cr $@ $^
+
+bin/istatd: $(DAEMON_MAIN_OBJ) $(LIB_DEPS)
+ $(CXX) -g $(DAEMON_MAIN_OBJ) -o $@ $(LXXFLAGS) $(SYS_LIBS)
+$(eval $(call add_install,bin/istatd,$(USR_PREFIX)/bin/istatd,775))
+bin/splitd: $(SPLITTER_MAIN_OBJ) $(LIB_DEPS)
+ $(CXX) -g $(SPLITTER_MAIN_OBJ) -o $@ $(LXXFLAGS) $(SYS_LIBS)
+$(eval $(call add_install,bin/splitd,$(USR_PREFIX)/bin/splitd,775))
+
+$(foreach test,$(TESTS),$(eval $(call build_test,$(test))))
+$(foreach tool,$(TOOLS),$(eval $(call build_tool,$(tool))))
+$(foreach dir,obj bin obj/test obj/tool obj/daemon obj/lib obj/splitter,$(eval $(call build_dir,$(dir))))
+
+obj/%.o: %.cpp
+ $(CXX) $(CXXFLAGS) -c -o $@ $<
+
+# tests only require libs to be built
+tests: $(DIR_DEPS) $(patsubst %,run_%,$(TESTS_TO_RUN))
+ @echo "tests complete"
+
+# ftests require istatd to be built
+ftests: $(DIR_DEPS) $(BINS) tests $(FTEST_FILES)
+ @for ft in $(FTEST_FILES); do printf '\n============================================\nftest %s\n' "$$ft"; $$ft || exit 1; done
+ bin/istatd --test --config test.cfg
+ @echo "ftests complete"
+
+-include $(DEPS)
+
+$(eval $(call add_install,istatd.default,$(ETC_PREFIX)/default/istatd,755))
+$(eval $(call add_install,istatd-init.sh,$(ETC_PREFIX)/init.d/istatd,755))
+$(foreach set,$(SETTINGS_SRCS),$(eval $(call add_precious_install,$(set),$(VAR_PREFIX)/db/istatd/$(set),664)))
+$(eval $(call add_precious_install,istatd.settings,$(ETC_PREFIX)/istatd.cfg,644))
+$(foreach file,$(FILES_SRCS),$(eval $(call add_install,$(file),$(USR_PREFIX)/share/istatd/files/$(notdir $(file)),664)))
+
+# add install must go before make directories
+$(foreach dir,$(INSTALL_DIRS),$(eval $(call mk_install_dir,$(patsubst %/,%,$(dir)))))
+$(call mk_install_dir,$(VAR_PREFIX)/db/istatd)
+
+install: $(INSTALL_DIRS) $(INSTALL_DSTS)
+ #update-rc.d istatd defaults
+ @echo done
+
+uninstall:
+ rm -f $(INSTALL_DSTS)
+ rm -f $(ETC_PREFIX)/rc*.d/*istatd
164 README.md
@@ -1,4 +1,166 @@
istatd
======
-Real-time metrics gathering, recording, and graphing
+Check with jwatte@imvu.com for more information. Not currently released
+for redistribution! All rights reserved. Copyright 2011 IMVU, Inc.
+
+The purpose of istatd is to efficiently collect, store and retrieve
+named statistics from a large number of sources. This is similar to
+Cacti, Graphite, Zabbix, and a bunch of other systems. In fact, istatd
+started out as a storage back-end for Graphite, to replace the built-in
+carbon back-end. The specific goals of this system are:
+
+- Support 100,000+ distinct counters with 3 different frequencies and
+ retention ages, the shortest frequency being 10 seconds.
+- Automatically creating new counter files for all new counters that
+ samples are received for.
+- Calculating statistics for each retention bucket -- minimum, maximum,
+ average, standard deviation.
+
+For more documentation than what is found in this file, see:
+- https://github.com/imvu/istatd/wiki
+
+This program is implemented in C++ using boost::asio for asynchronous,
+multi-threaded, evented net handling. One version of this program used
+mmap() to do counter I/O asynchronously using madvise() and msync().
+However, this ended up being a real performance problem, because the
+Linux kernel has one big tree (not hash table) to manage memory
+regions, and this tree is protected using a single lock, serializing
+all calls to mmap() and searching through a very deep tree for each
+call. Trying to start up, loading a few hundred thousand counter files,
+each of which has 3 mmap() calls, ends up choking the CPU and failing
+badly. Thus, the I/O interface is in terms of mmap() operations, but
+the actual implementation uses lseek() and read()/write(). A periodic
+timer iterates over all counters to make sure they are flushed to
+disk about once every 5 minutes. (Check the --flush option)
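Review note on the I/O paragraph above: a minimal sketch of slot-addressed counter I/O done with lseek() plus read()/write() instead of mmap(). The `Record` layout and the `writeRecord`/`readRecord` names are hypothetical and exist only for illustration; the real on-disk format is defined in include/istat/Bucket.h and lib/StatFile.cpp, which are not shown in this part of the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical fixed-size record (assumption, not the istatd file format).
struct Record {
    std::int64_t time;   // bucket start time, seconds since the epoch
    double sum;          // sum of samples in the bucket
    std::int32_t count;  // number of samples in the bucket
    std::int32_t pad;    // explicit padding so the struct has no hidden gaps
};

// Byte offset of slot `index` in a file that holds only fixed-size records.
static off_t slotOffset(std::int64_t index) {
    assert(index >= 0);
    return static_cast<off_t>(index) * static_cast<off_t>(sizeof(Record));
}

// Write one record into slot `index`; returns false on a seek error or short write.
bool writeRecord(int fd, std::int64_t index, const Record &r) {
    if (lseek(fd, slotOffset(index), SEEK_SET) == static_cast<off_t>(-1)) {
        return false;
    }
    return write(fd, &r, sizeof(r)) == static_cast<ssize_t>(sizeof(r));
}

// Read one record from slot `index`; returns false on a seek error or short read.
bool readRecord(int fd, std::int64_t index, Record &out) {
    if (lseek(fd, slotOffset(index), SEEK_SET) == static_cast<off_t>(-1)) {
        return false;
    }
    return read(fd, &out, sizeof(out)) == static_cast<ssize_t>(sizeof(out));
}
```

Because every access is an explicit seek plus a read or write on a file descriptor, opening a few hundred thousand counter files creates no memory mappings at all, which is the property the paragraph is after.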
+
+Any counter will get a default retention, configured as 10 second data
+for a few days, 5 minute data for a few months, and 1 hour data for a
+few years. All updates go into all of the buckets -- there is no
+decimation, only aggregation (statistics like min, max, average and
+standard deviation are collected/calculated).
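Review note on the retention paragraph above: a rough sketch of "aggregation, not decimation". Every sample is added to one bucket per retention interval, and each bucket keeps enough running sums to report minimum, maximum, average and standard deviation. `BucketStats` and `bucketStart` are hypothetical names, and the population (not sample) standard deviation is an assumption; the actual logic is in lib/Bucket.cpp, which is not shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Running aggregate for one retention bucket (illustrative only).
struct BucketStats {
    std::uint64_t count = 0;
    double sum = 0.0;
    double sumSq = 0.0;
    double min = 0.0;
    double max = 0.0;

    // Fold one sample into the bucket; nothing is ever thrown away.
    void add(double v) {
        if (count == 0) {
            min = max = v;
        } else {
            min = std::min(min, v);
            max = std::max(max, v);
        }
        sum += v;
        sumSq += v * v;
        ++count;
    }

    double average() const {
        return count ? sum / static_cast<double>(count) : 0.0;
    }

    // Population standard deviation derived from the running sums.
    double sdev() const {
        if (count == 0) {
            return 0.0;
        }
        const double mean = average();
        const double var = sumSq / static_cast<double>(count) - mean * mean;
        return var > 0.0 ? std::sqrt(var) : 0.0;  // clamp tiny negative rounding error
    }
};

// Start time of the bucket that contains timestamp `t` for a given interval
// in seconds, for example 10, 300 or 3600 with the default retention.
std::int64_t bucketStart(std::int64_t t, std::int64_t interval) {
    assert(interval > 0);
    return t - (t % interval);
}
```

With the default retention described above, one incoming sample would be folded into three such buckets: the ones found by calling `bucketStart` with 10, 300 and 3600.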
+
+Some random thoughts
+--------------------
+
+- Stats are received on a TCP port in a simple line-based format, or
+ on a UDP port in the same format.
+- A running istatd can serve up statistics over a simple HTTP
+ interface on a port that you configure.
+- Files in the "files" directory get served from a URL named
+ /?f=filename -- directory paths are not supported!
+- A running istatd can forward all stats it receives to another
+ instance. This allows read slave trees, simple replication, etc.
+- The istatd can collect information about the local machine (network,
+ CPU, memory and disk use) as statistics.
+- There is a simple webapp to allow browsing of the counters. This is
+ in turn served on the HTTP interface, at the root. It uses
+ Dygraph to draw simple graphs, querying for JSON counter data from
+ storage.
+- Retention is configurable only globally (for all new counters created).
+- Flush rate is "at least once every 5 minutes" although buckets are
+ aggregated over 10 seconds with the default settings.
+- There exists a command line tool to dump the counter data in a given
+ individual counter file (from disk) to csv (comma separated values)
+ format.
+- If you lose the machine, the data on the disk is always consistent in
+ the sense that each bucket except possibly the currently active bucket
+ page is in a good state.
+- If you lose the disk, you hopefully had already set up the system with
+ live replication :-)
+- The replication functionality will attempt to re-connect to the target
+ system with exponential back-off, buffering data it receives in the
+ meanwhile. However, if the daemon is then shut down, that buffered data
+ is lost from the point of view of replication (it will still live on the
+ local disk).
+- Some counters will want to aggregate. For example, CPU idle, kernel and
+ user times will add up to 100%, so they could generally be plotted together
+ in a single graph. This is solved in two ways:
+ - For a counter with multiple levels; you can configure global
+ aggregation up the tree for a few levels. So, for example, for a
+ counter with the name cpu.idle.hostname, the sample will be aggregated
+ into both cpu.idle.hostname, and cpu.idle.
+ - For more advanced counter aggregation, you can use the caret format of
+ counter names. The counter cpu.idle^host.a^class.b^type.c will aggregate
+ into counters named cpu.idle.host.a, cpu.idle.class.b and cpu.idle.type.c.
+ You typically want to specify these using --localstat, for example.
+- Counters that start with "*" are treated as events -- count number of events
+ per second. These are aggregated over the shortest retention interval; longer
+ retention intervals treat the event rates as gauges with a fixed count equal
+ to the number of seconds in the bucket.
+- StatFile scales at least 10x better than a SQL database, perhaps much better.
+ For large clusters (IMVU currently has 800 motherboards in production) this
+ constant factor really matters.
+- There is a simple key/value interface that supports storing JSON-style keys
+ mapping to strings (only) given different names. This is used to support per-
+ user personalization, and storing saved dashboards of counters. That system
+ is not intended to be scalable beyond a few hundred keys per container, and
+ a few hundred containers. Nor is it intended to support a constant churn of
+ key/value updates. If you need that, use Redis!
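Review note on the two counter aggregation bullets above: a small sketch of how the names could be expanded, using only the examples given in the README. `expandCaret` and `aggregateUp` are hypothetical helpers, not functions from the istatd sources, and returning a caret-free name unchanged is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Caret format: "cpu.idle^host.a^class.b" yields
// "cpu.idle.host.a" and "cpu.idle.class.b".
std::vector<std::string> expandCaret(const std::string &name) {
    std::vector<std::string> out;
    std::string::size_type caret = name.find('^');
    if (caret == std::string::npos) {
        out.push_back(name);  // assumption: plain names pass through unchanged
        return out;
    }
    const std::string base = name.substr(0, caret);
    while (caret != std::string::npos) {
        const std::string::size_type next = name.find('^', caret + 1);
        const std::string suffix = name.substr(
            caret + 1,
            next == std::string::npos ? std::string::npos : next - caret - 1);
        if (!suffix.empty()) {
            out.push_back(base + "." + suffix);
        }
        caret = next;
    }
    return out;
}

// Aggregation up the tree: the full name plus up to `levels` parent names,
// so ("cpu.idle.hostname", 1) yields "cpu.idle.hostname" and "cpu.idle".
std::vector<std::string> aggregateUp(const std::string &name, int levels) {
    assert(levels >= 0);
    std::vector<std::string> out(1, name);
    std::string cur = name;
    for (int i = 0; i < levels; ++i) {
        const std::string::size_type dot = cur.rfind('.');
        if (dot == std::string::npos) {
            break;  // already at the root of the counter tree
        }
        cur.erase(dot);
        out.push_back(cur);
    }
    return out;
}
```

A sample received for a name would then be recorded once under each string in the returned vector.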
+
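Review note on the replication bullet in the list above: the README says reconnects use exponential back-off but gives no numbers. The sketch below shows the general technique only. The one-second base delay, the one-minute cap and the name `backoffDelayMs` are all assumptions made for illustration, not values taken from daemon/ReplicaOf.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Delay before reconnect attempt number `attempt` (0-based): the base delay
// doubles on every failed attempt and is clamped to a maximum.
// baseMs and capMs are hypothetical defaults.
std::uint32_t backoffDelayMs(unsigned attempt,
                             std::uint32_t baseMs = 1000,
                             std::uint32_t capMs = 60000) {
    assert(baseMs > 0 && capMs >= baseMs);
    std::uint64_t delay = baseMs;
    // Stop doubling once the cap is reached so the value cannot overflow.
    for (unsigned i = 0; i < attempt && delay < capMs; ++i) {
        delay *= 2;
    }
    return static_cast<std::uint32_t>(std::min<std::uint64_t>(delay, capMs));
}
```

A caller would reset the attempt counter to zero after a successful connection, and keep buffering incoming samples while waiting out each delay.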
+Build stuff
+-----------
+
+Just doing "make" should build on a modern Linux machine with dependencies
+installed. Nothing else is supported right now. If other things become
+supported, they will do so without the use of autotools, for religious reasons.
+
+You will likely want to do something like:
+ sudo apt-get install libboost-all-dev
+
+The libstatgrab library is available only through source download. Configure
+it and make + make install (into /usr/local is fine).
+
+g++ and GNU make are needed, too.
+
+The make file is automatic:
+
+- Anything in lib/ named .cpp gets included into libistat.
+- Headers for those things should live in include/istat (this will
+ simplify later installation builds, if appropriate).
+- Anything in daemon named .cpp gets included into the bin/istatd executable.
+ Headers just live in daemon as well.
+- Anything in tool/ named .cpp gets built as a separate command-line tool,
+ with the libistat library linked (as well as boost dependencies etc).
+- Anything in test/ named .cpp gets built as a separate command-line program,
+ and executed as part of the make. This ensures that unit tests pass.
+- Unit tests are not as comprehensive as could be, and the daemon is not as
+ factored into testable parts as could be. That being said, it is a lot
+ better than nothing.
+- The build uses one small lib with a "public" API (libistat) and one large
+ lib with everything else except main.cpp. The reason for the second lib is
+ largely to support unit testing.
+- Dependencies (included files, outputs) are automatically tracked, so
+ touching a .h file will automatically re-build the necessary .cpp files etc.
+- Files in ftest named test_* are run as functional tests at the end of a
+ build. These make use of a common set of functions in ftest/functions and
+ also a common set of configurations in ftest/*.cfg
+
+Because the daemon may want to open privileged files for local stats, bind
+to ports that may be privileged, and update the number of files that may be
+opened at one time (important!), it wants to run as suid root. It will drop
+privileges after starting up. However, as the file gets re-written each time
+it is built, a script named suid.sh is called after linking if it exists and
+is executable. This will attempt to sudo chmod the output file -- prepare to
+enter your password, or abort.
+
+There are scripts to build a dpkg in the "debian" directory. Other packaging
+systems might be accepted as community contributions. Hint, hint ;-)
+
+Again, for more detailed documentation, see the Github wiki!
+
+BUGS
+----
+
+- Istatd currently does not automatically shard across hosts for storage.
+ Once we hit sufficient size, we should write a gateway that works as
+ shard distributor for both incoming stats and queries.
+- We have seen "Going back in time" errors from lib/StatFile.cpp for local
+ stats -- how is that possible? (This was a long time ago -- we may have
+ fixed it, but something to keep an eye on.)
+- There are a nearly unlimited number of features and enhancements we would
+ want to add, such as support for discrete "events" and support for strongly
+ consistent multi-host replication.
+- The UI is functional and looks good, but we could always have more display
+ modes, such as stacked, min/max, etc.
+
16 TESTING.md
@@ -0,0 +1,16 @@
+
+
+How to test the individual pieces
+=================================
+
+Generate a file with a bunch of data:
+
+ bin/random_numbers -t 100000 1000 | bin/istatd_nums2file /tmp/file.cf -t 10
+
+Dump the file to CSV format (calculating sdev etc):
+
+ bin/istatd_filedump /tmp/file.cf
+
+Get information about a counter file (checking header fields):
+
+ bin/istatd_fileinfo /tmp/file.cf
7 build.sh
@@ -0,0 +1,7 @@
+#!/bin/bash
+
+sudo rm /var/log/istatd.log
+make OPT=
+sudo /etc/init.d/istatd stop
+sudo cp bin/istatd /usr/bin/istatd
+sudo /etc/init.d/istatd start