The equijoin currently in dev is the basic one. It doesn't scale well when one key is predominant, and it doesn't exploit special cases like one side having a small number of records. There is a lot of work that could go into having a better join feature.
See the paper "Processing Theta-Joins using MapReduce".
One possible technique is to run a preliminary job that builds a Bloom filter from the keys of one or both sides of the join, then perform the join using the Bloom filter as, indeed, a filter in the map phase. This adds jobs but moves work to the map side (reportedly faster in real instances).
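
A rough sketch of the Bloom filter idea, in plain Python rather than as real MapReduce jobs. Everything here is an assumption made for illustration: the names `BloomFilter` and `bloom_join`, the default filter size, and the use of salted SHA-256 as the hash family are hypothetical and not part of the current equijoin. The filter is built from the left side only, and the three phases are simulated in memory.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def bloom_join(left, right, num_bits=1024, num_hashes=3):
    """Equijoin two lists of (key, value) pairs, pre-filtering the right side."""
    # Preliminary job: build a filter over the keys of the left side.
    bloom = BloomFilter(num_bits, num_hashes)
    for key, _ in left:
        bloom.add(key)

    # Map phase: drop right-side records whose key cannot be on the left.
    right_kept = [(k, v) for k, v in right if k in bloom]

    # Shuffle and reduce, simulated: group the left side by key, then pair up.
    left_by_key = {}
    for k, v in left:
        left_by_key.setdefault(k, []).append(v)

    joined = []
    for k, right_value in right_kept:
        # False positives from the filter find no match here and are dropped.
        for left_value in left_by_key.get(k, []):
            joined.append((k, left_value, right_value))
    return joined
```

The filter only reduces how many records reach the shuffle. Correctness still comes from the exact key match in the final step, so a false positive costs some wasted transfer but never produces a wrong row.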