| <HTML> | |
| <HEAD> | |
| <META HTTP-EQUIV="content-type" CONTENT="text/html;charset=iso-8859-1"> | |
| <META NAME="generator" CONTENT="GoLive CyberStudio"> | |
| <TITLE>W e b w a t c h [HELP]</TITLE><!-- Concept & HTML Authoring : Pino De Luca: pino@dad.be--> | |
| </HEAD> | |
| <BODY BGCOLOR="#FFFFFF" TEXT="#000010" LINK="#0031CE" ALINK="#FF0000" VLINK="#CE3100"> | |
| <CENTER> | |
| <P><!-- BEGIN OF HEADINGS --------------------------------------------------> | |
| <TABLE BORDER=0 CELLPADDING=0 CELLSPACING=0 WIDTH=608> | |
| <TR> | |
| <TD WIDTH=10> </TD> | |
| <TD WIDTH=68><!-- HITWATCHERS --> | |
| <IMG SRC="http://www.hitwatchers.com/source/webtotal.gif" WIDTH=1 HEIGHT=1 BORDER=0 ALIGN="left"> <!-- --> | |
| <!-- DAD MONITOR --> | |
| <IMG SRC="http://194.78.47.32/webwatch.gif" WIDTH=1 HEIGHT=1 BORDER=0> <!-- --> | |
| </TD> | |
| <TD COLSPAN=2 VALIGN="TOP" ALIGN="RIGHT"><A HREF="index.html" ONMOUSEOVER="window.status ='Home';return true"><IMG SRC="http://www.webwatch.be/images/logoWW2.gif" ALT="[ W E B W A T C H ]" BORDER=0 WIDTH=142 HEIGHT=47></A></TD> | |
| <TD WIDTH=23 BGCOLOR="#FEC500" VALIGN="TOP"><IMG SRC="http://www.webwatch.be/images/Hmail.gif" WIDTH=23 HEIGHT=47></TD> | |
| <TD BGCOLOR="#FEC500" WIDTH=345> | |
| <DIV ALIGN="right"><B><FONT FACE="arial,helvetica" SIZE=5 COLOR="#FFFFFF">how to use WebWatch</FONT></B><FONT FACE="arial,helvetica" SIZE=5 COLOR="#FFFFFF"> </FONT> | |
| </DIV></TD> | |
| </TR> | |
| <TR> | |
| <TD WIDTH=10 BGCOLOR="#FF3100"> </TD> | |
| <TD COLSPAN=2 BGCOLOR="#FF3100"> </TD> | |
| <TD BGCOLOR="#FF3100" COLSPAN=3><B><FONT FACE="arial,helvetica" COLOR="#FFFFFF">Internet in Belgium</FONT></B></TD> | |
| </TR> | |
| </TABLE> | |
| <TABLE BORDER=0 CELLPADDING=0 CELLSPACING=0> | |
| <TR> | |
| <TD><A HREF="index.html" ONMOUSEOVER="window.status ='Directory';return true"><IMG SRC="images/directory.gif" WIDTH=41 HEIGHT=41 BORDER=0></A> <A HREF="http://crawler.webwatch.be/crawler.acgi" ONMOUSEOVER="window.status ='Crawler';return true"><IMG SRC="images/crawler.gif" WIDTH=41 HEIGHT=41 BORDER=0></A> <A HREF="/chat/default.html" ONMOUSEOVER="window.status ='Chat';return true"><IMG SRC="images/chat.gif" WIDTH=41 HEIGHT=41 BORDER=0></A> <A HREF="most/index.html" ONMOUSEOVER="window.status ='Most wanted';return true"><IMG SRC="images/most.gif" WIDTH=41 HEIGHT=41 BORDER=0></A> <A HREF="mail/index.html" ONMOUSEOVER="window.status ='By Mail';return true"><IMG SRC="images/mail.gif" WIDTH=41 HEIGHT=41 BORDER=0></A> <A HREF="/news/" ONMOUSEOVER="window.status ='News';return true"><IMG SRC="http://www.webwatch.be/images/News.gif" WIDTH=41 HEIGHT=41 BORDER=0></A> <A HREF="week.html" TARGET="_top" ONMOUSEOVER="window.status ='Site of the Week';return true"><IMG SRC="images/week.gif" WIDTH=41 HEIGHT=41 BORDER=0></A> <A HREF="http://www.webwatch.be/users/" ONMOUSEOVER="window.status ='Community';return true"><IMG SRC="http://www.webwatch.be/images/comm.gif" WIDTH=41 HEIGHT=41 BORDER=0></A> <A HREF="recent.dlp" ONMOUSEOVER="window.status ='Recent Sites';return true"><IMG SRC="images/recent.gif" WIDTH=41 HEIGHT=41 BORDER=0></A></TD> | |
| </TR> | |
| </TABLE> | |
| <BLOCKQUOTE> | |
| <P><!-- END OF HEADINGS --------------------------------------------------> | |
| <FONT FACE="arial,helvetica" SIZE=5>How to use Webwatch?</FONT> | |
| </CENTER> | |
| <P><B><FONT FACE="arial,helvetica">Webwatch</FONT></B><FONT FACE="arial,helvetica"> is a unique search engine that allows you to find Belgian Internet | |
| sites in two ways: by a subject index or a keyword search.</FONT> | |
| <P> | |
| </BLOCKQUOTE> | |
| <P><FONT FACE="arial,helvetica"><TABLE BORDER=0 CELLPADDING=0 CELLSPACING=0 WIDTH="100%"> | |
| <TR BGCOLOR="#FFCC00"> | |
| <TD><B>&nbsp;Index or crawler?</B></TD> | |
| </TR> | |
| </TABLE> | |
| </FONT> | |
| <BLOCKQUOTE> | |
| <P><FONT FACE="arial,helvetica">The <B>Index</B> groups approximately 7800 sites (see <A HREF="status.html">current status</A>) according to subject categories, and includes short descriptions | |
| of the sites. It is best used to find a variety of sites that | |
| treat the same subject or to browse through different subjects | |
| to find new and interesting sites. New sites are added by their | |
| creators, who wish to alert the public to their site's existence, | |
| or by the Webwatch team. The Index is compiled by our staff at | |
| DAD, who check the URLs, the descriptions, and the subject headings | |
| for accuracy. The Index also contains sections devoted to 'New' | |
| sites, 'Cool' sites and 'Must' sites. New sites are sites added | |
| in the past ten days. Cool sites are sites of superior content | |
| or quality picked by the Webwatch team. Must sites are generally | |
| business-related sites of use to the Internet community. </FONT> | |
| <P><FONT FACE="arial,helvetica">For example, if you are interested in finding information about | |
| automobiles, use the index by clicking on the subject 'Business | |
| and Economy / Cars'. All of the sites that have been submitted | |
| to Webwatch that deal with automobiles will be listed with short | |
| descriptions. </FONT> | |
| <P> | |
| </BLOCKQUOTE> | |
| <P><FONT FACE="arial,helvetica"><TABLE BORDER=0 CELLPADDING=0 CELLSPACING=0 WIDTH="100%"> | |
| <TR BGCOLOR="#FFCC00"> | |
| <TD><B> Optimal use of the crawler ...</B></TD> | |
| </TR> | |
| </TABLE> | |
| </FONT> | |
| <BLOCKQUOTE> | |
| <P><FONT FACE="arial,helvetica">The <B>Crawler</B> is an automated robot that continuously searches the Internet | |
| for new Belgian sites, or sites of Belgian interest. It methodically | |
| indexes the contents of the sites, finding keywords that can be | |
| used for searches, and discovering new links to search. The crawler | |
| is best used to find combinations of words that occur in the same | |
| page, allowing very detailed searches for specific information. | |
| There are many more sites contained in the Crawler's database | |
| than in the Index. </FONT> | |
| <P><FONT FACE="arial,helvetica">The keyword search automatically looks for all of the words in | |
| the search field. If you type 'sports cars' into the search box, | |
| the crawler will look for all of the pages it has | |
| indexed that contain BOTH words. The crawler does not distinguish | |
| between capital and lowercase letters, or between accented and unaccented letters: E, é, è, ê, and | |
| e are all considered the same letter, e. </FONT> | |
| <P><FONT FACE="arial,helvetica">You can change the search options by clicking on the 'options' | |
| link. Here, you may: </FONT> | |
| <UL> | |
| <LI><FONT FACE="arial,helvetica">specify whether you would like to search for ALL of the words | |
| in the search field or ANY of the words in the search field. </FONT> | |
| <LI><FONT FACE="arial,helvetica">specify whether the crawler will return 10 hits or 25 hits. </FONT> | |
| <LI><FONT FACE="arial,helvetica">specify detailed results or summary results. Summary results display | |
| the titles of the sites found. Detailed results display the title, | |
| file size, first three lines of text, and URL, allowing a selection | |
| to be made from among the sites found without having to visit | |
| each address. </FONT> | |
| <LI><FONT FACE="arial,helvetica">specify that you are searching for the beginning of a word. If you | |
| type 'auto' and select this option, the crawler will find | |
| not only 'automobile' but also 'autodidactic' and 'automaton'. </FONT> | |
| <LI><FONT FACE="arial,helvetica">limit the results to sites entered within the past day, week, or month, or since the last | |
| update. This option can make repeat searches more effective. </FONT> | |
| <LI><FONT FACE="arial,helvetica">restrict the search to the URL, Title, or Headers only. </FONT> | |
| </UL> | |
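<P><FONT FACE="arial,helvetica">As an illustration only, the ALL / ANY choice and the "beginning of a word" option could be sketched as below. This is not Webwatch's actual code: the function name <B>matches</B> and its parameters are hypothetical, and a page is assumed to be a plain list of words. </FONT>

```python
def matches(page_words, query, mode="all", prefix=False):
    """Return True if a page's words satisfy the query.

    mode="all" requires every query word (the default behaviour described
    above); mode="any" requires at least one of them.
    prefix=True treats each query word as the beginning of a word.
    """
    page = {w.lower() for w in page_words}
    terms = query.lower().split()

    def hit(term):
        if prefix:
            return any(w.startswith(term) for w in page)
        return term in page

    combine = all if mode == "all" else any
    return combine(hit(t) for t in terms)


page = ["red", "sports", "cars", "automobile"]
print(matches(page, "sports cars"))               # both words are on the page
print(matches(page, "sports boats"))              # "boats" is missing
print(matches(page, "sports boats", mode="any"))  # one word is enough
print(matches(page, "auto", prefix=True))         # "automobile" starts with "auto"
```

<P><FONT FACE="arial,helvetica">Lower-casing both sides mirrors the statement that the crawler ignores capitalisation; accent handling is left out of this sketch. </FONT>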
| <P><FONT FACE="arial,helvetica"><TABLE BORDER=0 CELLPADDING=1 CELLSPACING=0> | |
| <TR BGCOLOR="#FFCC99"> | |
| <TD><B>Relevance Ranking</B></TD> | |
| </TR> | |
| </TABLE> | |
| </FONT> | |
| <P><FONT FACE="arial,helvetica">When Webwatch parses a file during indexing, it notes the location | |
| and frequency of the keywords that it encounters. Words in particular | |
| HTML tags are weighted more heavily. An unweighted word, that is, | |
| a word that does not appear in any of the weighting tags, earns | |
| a value of 1 for each occurrence on the page. Words in the following | |
| tags get additional weighting as follows: </FONT> | |
| <UL> | |
| <LI><FONT FACE="arial,helvetica">Words in the TITLE tag are weighted 10 times heavier than an | |
| unweighted word. </FONT> | |
| <LI><FONT FACE="arial,helvetica">Words included as META KEYWORDS are also weighted | |
| 10 times heavier than unweighted words. </FONT> | |
| <LI><FONT FACE="arial,helvetica">Words found in one of the | |
| header tags are weighted 5 times heavier than unweighted words. </FONT> | |
| <LI><FONT FACE="arial,helvetica">Words contained in a hyperlink are weighted 3 times heavier than | |
| unweighted words. </FONT> | |
| </UL> | |
| <P><FONT FACE="arial,helvetica">If the word "Gretzky" appears once in the title, twice in headers, | |
| and 6 times throughout the rest of the page, it would receive a | |
| total weighting of 26 (1x10 + 2x5 + 6x1). </FONT> | |
| <P><FONT FACE="arial,helvetica">This weighting is used to determine relevance. If you search for | |
| "Gretzky", the above page would return a weighting of 26. If there | |
| are other pages with the word "Gretzky" they will be ranked according | |
| to the same scoring system. The higher the number, the higher | |
| the relevance ranking. </FONT> | |
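<P><FONT FACE="arial,helvetica">The scoring described above can be reproduced with a few lines of code. The sketch below is an assumption-laden illustration, not the crawler's real implementation: the weights come from the text, while the dictionary keys, the function names <B>relevance</B> and <B>rank</B>, and the sample pages are invented for the example. </FONT>

```python
# Weights as stated in the text; the key names are illustrative only.
WEIGHTS = {
    "title": 10,
    "meta_keywords": 10,
    "header": 5,
    "link": 3,
    "body": 1,  # an unweighted word
}


def relevance(occurrences):
    """Score one word on one page.

    `occurrences` maps a location ("title", "header", ...) to the number
    of times the word appears there.
    """
    return sum(WEIGHTS[where] * count for where, count in occurrences.items())


def rank(pages):
    """Return page names ordered from most to least relevant."""
    return sorted(pages, key=lambda name: relevance(pages[name]), reverse=True)


# The "Gretzky" example: once in the title, twice in headers, 6 times elsewhere.
gretzky = {"title": 1, "header": 2, "body": 6}
print(relevance(gretzky))  # 1*10 + 2*5 + 6*1 = 26

# Two hypothetical pages containing the same word.
pages = {
    "a.html": gretzky,
    "b.html": {"link": 2, "body": 3},  # 2*3 + 3*1 = 9
}
print(rank(pages))  # a.html is listed before b.html
```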
| <P><FONT FACE="arial,helvetica"><TABLE BORDER=0 CELLPADDING=1 CELLSPACING=0> | |
| <TR BGCOLOR="#FFCC99"> | |
| <TD><B>Search Tips</B></TD> | |
| </TR> | |
| </TABLE> | |
| </FONT> | |
| <P><B><FONT FACE="arial,helvetica">Numbers</FONT></B><FONT FACE="arial,helvetica"> </FONT> | |
| <P><FONT FACE="arial,helvetica">Number indexing can be turned on or off in Webwatch by the robot | |
| administrator. Turning it off makes the Webwatch indexes smaller, | |
| so it may not be possible to search for numbers that appear | |
| in a document. </FONT> | |
| <P><B><FONT FACE="arial,helvetica">Noise Words </FONT></B> | |
| <P><FONT FACE="arial,helvetica">Webwatch does not include common words such as "also", "been" | |
| and "there" in the index. These words are ignored when searching. | |
| If one or more of them is used in a search, a message on the results page | |
| indicates which words were noise words. The | |
| administrator can edit the list of noise words; see the Reference | |
| Guide for more information. </FONT> | |
| <P><B><FONT FACE="arial,helvetica">Punctuation</FONT></B><FONT FACE="arial,helvetica"> </FONT> | |
| <P><FONT FACE="arial,helvetica">Punctuation is not indexed in Webwatch, so all punctuation | |
| is removed from a search string. This includes the "@" character, | |
| which affects e-mail addresses. The e-mail address "wayne@greatone.com" | |
| will be indexed in Webwatch as three separate words: "wayne", | |
| "greatone" and "com". A search for "wayne@greatone.com" will therefore | |
| become a search for the words "wayne", "greatone" and "com", and | |
| will find pages with The Great One's e-mail address on them. The | |
| same applies to hyphenated words such as "CD-ROM". </FONT> | |
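<P><FONT FACE="arial,helvetica">To make the effect of punctuation removal concrete, here is a minimal sketch of how a string might be split into words. It assumes that every character that is not a letter or a digit acts as a separator; the function name <B>split_words</B> is hypothetical, and Webwatch's real rules may differ in detail. </FONT>

```python
import re


def split_words(text):
    """Split text into lower-case words, dropping all punctuation.

    [^\\W_]+ matches runs of letters and digits, so "@", ".", "-" and
    any other punctuation mark simply separates words.
    """
    return re.findall(r"[^\W_]+", text.lower())


print(split_words("wayne@greatone.com"))  # ['wayne', 'greatone', 'com']
print(split_words("CD-ROM"))              # ['cd', 'rom']
```

<P><FONT FACE="arial,helvetica">Because the page text and the search string go through the same splitting, a search for the full e-mail address still finds pages that contain it. </FONT>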
| <P><B><FONT FACE="arial,helvetica">Word Size</FONT></B><FONT FACE="arial,helvetica"> </FONT> | |
| <P><FONT FACE="arial,helvetica">Webwatch does not index words with 2 or fewer letters. Such words | |
| will be ignored in a search. </FONT> | |
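<P><FONT FACE="arial,helvetica">The noise-word and word-size rules can be pictured as a filter applied to the search words before the index is consulted. The sketch below is illustrative only: the noise list holds just the three examples named above (the real list is maintained by the administrator), and the names <B>NOISE_WORDS</B>, <B>MIN_LENGTH</B> and <B>clean_query</B> are invented for the example. </FONT>

```python
# Only the three examples given in the text; the real list is longer.
NOISE_WORDS = {"also", "been", "there"}

# Words with 2 or fewer letters are not indexed.
MIN_LENGTH = 3


def clean_query(words):
    """Separate search words into those kept and those ignored.

    Returns (kept, noise, too_short) so that the ignored words can be
    reported back to the user.
    """
    kept, noise, too_short = [], [], []
    for word in words:
        w = word.lower()
        if len(w) < MIN_LENGTH:
            too_short.append(w)
        elif w in NOISE_WORDS:
            noise.append(w)
        else:
            kept.append(w)
    return kept, noise, too_short


kept, noise, too_short = clean_query(["There", "is", "also", "chocolate"])
print(kept)       # ['chocolate']
print(noise)      # ['there', 'also']
print(too_short)  # ['is']
```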
| <P><B><FONT FACE="arial,helvetica">Accented Characters</FONT></B><FONT FACE="arial,helvetica"> </FONT> | |
| <P><FONT FACE="arial,helvetica">Webwatch supports special characters in ISO Latin encoding; | |
| however, it does not index accent marks. Accented | |
| characters are indexed without the accent. Webwatch also converts | |
| HTML entities into the base character for inclusion in the index. | |
| </FONT> | |
| <P><FONT FACE="arial,helvetica">It is important to note that for searching the following characters | |
| are all considered equal: </FONT> | |
| <P><FONT FACE="arial,helvetica">e, E, é (&amp;eacute;), É (&amp;Eacute;), ë (&amp;euml;), etc. </FONT> | |
| <P><FONT FACE="arial,helvetica">So a search for "Café" will also find "Cafe", "CAFE", "Cafë", | |
| etc. Typing the HTML entity itself into the search box, as in "Caf&amp;eacute;", will not work. </FONT> | |
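<P><FONT FACE="arial,helvetica">One way to picture this behaviour is the sketch below, which folds text to lower-case, unaccented letters. It is an assumption about how such folding could be done, not a description of Webwatch's internals: the function names <B>index_form</B> and <B>query_form</B> are hypothetical, and the sketch assumes that HTML entities are decoded for page text but not for search strings. </FONT>

```python
import html
import unicodedata


def fold(text):
    """Lower-case the text and strip accent marks from its letters."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(
        c for c in decomposed if not unicodedata.combining(c)
    )
    return without_accents.lower()


def index_form(page_text):
    """Form stored in the index: entities are decoded, then folded."""
    return fold(html.unescape(page_text))


def query_form(search_text):
    """Form used for a search string: folded, entities left as typed."""
    return fold(search_text)


# All of these spellings meet at the same indexed form.
for spelling in ["Café", "Cafe", "CAFE", "Cafë", "Caf&eacute;"]:
    print(spelling, "->", index_form(spelling))

# A typed entity is not decoded, so it does not match the indexed word.
print(query_form("Café") == index_form("Caf&eacute;"))         # True
print(query_form("Caf&eacute;") == index_form("Caf&eacute;"))  # False
```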
| <P> | |
| </BLOCKQUOTE> | |
| <P><FONT FACE="arial,helvetica">The Internet is an evolving, living network. Although we try to | |
| keep the system accurate and timely, URLs contained in the Crawler | |
| and Index may not work if the server is down or offline, if the | |
| site has been removed, or if the site has been moved to another | |
| location. If you have problems accessing a site found on Webwatch, | |
| please alert us by sending a note to <A HREF="mailto:lips@dad.be">lips@dad.be</A>. </FONT> | |
| <CENTER> | |
| <P><FONT FACE="arial,helvetica"><!-- BEGIN OF SMALL NAVIGATION --------------------------------------------------> | |
| <HR> | |
| <TABLE BORDER=0 CELLPADDING=0 CELLSPACING=0 WIDTH=608> | |
| <TR> | |
| <TD WIDTH=121><IMG SRC="images/engines.gif" ALT="[ MORE ENGINES ]" ALIGN="left" WIDTH=19 HEIGHT=17> <FONT FACE="arial,helvetica" SIZE=2><A HREF="Search-engines.html">More engines</A></FONT></TD> | |
| <TD WIDTH=121><IMG SRC="images/status.gif" ALT="[ CURRENT STATUS ]" ALIGN="left" WIDTH=19 HEIGHT=17> <FONT FACE="arial,helvetica" SIZE=2><A HREF="status.html">Current status</A></FONT></TD> | |
| <TD WIDTH=131><IMG SRC="images/Sadd.gif" ALT="[ ADD YOUR SITE ]" ALIGN="left" WIDTH=19 HEIGHT=17> <FONT FACE="arial,helvetica" SIZE=2><A HREF="add.html">Add your site</A></FONT></TD> | |
| <TD WIDTH=111><IMG SRC="images/help.gif" ALT="[ HELP ]" ALIGN="left" WIDTH=19 HEIGHT=17> <FONT FACE="arial,helvetica" SIZE=2><A HREF="Help.html">Help</A></FONT></TD> | |
| <TD><IMG SRC="images/about.gif" ALT="[ ABOUT WEBWATCH ]" ALIGN="left" WIDTH=19 HEIGHT=17> <FONT FACE="arial,helvetica" SIZE=2><A HREF="Press.html">About WebWatch</A></FONT></TD> | |
| </TR> | |
| </TABLE> | |
| <!-- END OF SMALL NAVIGATION ----------------------------------------------------> | |
| </FONT> | |
| </CENTER> | |
| </BODY> | |
| </HTML> |